About me

Hi. I am an aspiring data scientist.

my_dict = {
    "name": "Kathy Tran",
    "Passion": "Data Science",
    "Projects": ...,
}

My skills

  • Generative AI with RAG and LLM

    - Develop Retrieval-Augmented Generation (RAG) pipelines that combine vector databases and LLMs to deliver context-aware, domain-specific answers.
    - Fine-tune Large Language Models (LLMs) and optimize prompt engineering to improve accuracy, reduce token usage, and enhance generation quality.
    - Integrate open-source LLMs (e.g., LLaMA 2, Mistral) or API-based models (e.g., GPT-4o) into applications for conversational AI, document Q&A, and knowledge assistants.

  • Machine Learning

    - Use predictive modeling with Python and Scikit-learn to identify patterns that support data-driven decisions.

    - Classification: Logistic regression, K-nearest neighbors, Decision Tree, Random Forest, Gradient Boosting Regression Trees

    - Regression: Linear regression

    - Unsupervised: K-means clustering

    - Time series: Long short-term memory (LSTM), ARIMA

  • Data collection

    - Utilize APIs to collect and aggregate data for analysis.

    - Scrape web pages with Python BeautifulSoup and Selenium WebDriver to gather information.

  • Data wrangling

    - Clean and organize raw data with Python Pandas and NumPy, ensuring high-quality datasets for analysis.

    - Use SciPy to solve linear equations and run statistical analyses when preprocessing complex datasets.

  • Database management

    - Manage and maintain databases using MySQL, PostgreSQL, or MongoDB (NoSQL).

    - Design databases with entity-relationship diagrams (ERDs) and logical modeling.

    - Migrate from on-premises data centers to the cloud with AWS, Microsoft Azure, or Google Cloud.

  • Data Visualization

    - Create visualizations with Python Matplotlib and Seaborn.

    - Develop interactive dashboards with Tableau & Power BI to effectively communicate insights.
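
As a rough illustration of the retrieval step in the RAG pipelines listed under Generative AI, here is a minimal sketch. It is not taken from any project on this page: the toy bag-of-words `embed` function stands in for a real embedding model and vector database, and `retrieve`, `build_prompt`, and the sample `docs` are hypothetical names and data chosen only for the example. The resulting prompt would then be passed to an LLM, which is left out here.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents, made up for this sketch.
docs = [
    "ARIMA models a time series using its own lags and forecast errors.",
    "K-means clustering groups points around k centroids.",
    "Tableau dashboards present metrics interactively.",
]

prompt = build_prompt("how does k-means clustering work", docs)
print(prompt)
```

In a real pipeline the bag-of-words vectors would be replaced by learned embeddings stored in a vector database, but the ranking-then-prompting shape stays the same.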

Portfolio

Achievements

Awards

  • KTHack

    Second Place Overall - KT Hack

    Web scraper

    Used Python BeautifulSoup to scrape data on investors' trades going back to when they started trading stocks, React.js to build a website that retrieves the data when the user types in an investor's name, and Node.js for web routing and API handling.

    See Devpost

  • Hack RU

    Best Sustainability University Hack - Hack RU

    Front-end Engineer

    We used Nodemailer to send weather alerts. Users enter a location in a form, and the website displays a map marking the potential hazards and danger areas there. We used React.js for the front-end and Express.js for the back-end.

    See Devpost

  • Hack TCNJ

    Best Dot Tech Domain Website - Hack TCNJ

    Machine Learning Engineer

    We used Django and Auth0 for user authentication and data processing, and Python with TensorFlow, PyTorch, and Keras to build an image recognition model that classifies trash into 4 categories so users can recycle it accordingly.

    See Devpost

  • NYU

    Dean's Honors List - New York University

    The Dean's List is an academic honor awarded to undergraduate students achieving high scholarship each academic year.


    • Students must be matriculated undergraduates who achieve a GPA of 3.7 or higher in each term.