Machine Learning Engineer Intern
Rockfish Data provides a high-fidelity synthetic data platform that addresses privacy, LLM training, and data sparsity challenges. As a Machine Learning Engineer Intern at Rockfish, I developed hyperparameter tuning strategies that match LLM configurations to the attributes of the input dataset, improving model performance and efficiency. I ran extensive workflow tests on diverse dataset schemas from both customer and internal sources, and streamlined preprocessing by resolving errors such as missing values. I also built an automated detection mechanism that identifies whether input data is time series or tabular, allowing the Rockfish Recommendation Engine to configure LLM parameters dynamically. By calibrating epoch counts so that rare data categories are captured effectively during training, I improved the platform's sampling efficiency. Through rigorous benchmarking against competitors such as Gretel and Mostly AI, I established that Rockfish offers a 12% data fidelity advantage. Collectively, this work redesigned the data ingestion workflow and reduced time-to-value for Rockfish customers.