Data First Jobs

Centraprise

AI Data Linguist (Malay / European Portuguese)

Contract · In Office · Seattle, Washington (USA)

$60,000–$75,000 · Posted Jun 11, 2026

Job Description: Data Linguist (Malay / European Portuguese)
  • Job Title: Data Linguist
  • Job Location: Seattle, WA or Burlingame, CA
  • Employment Type: Full-Time

Position Overview
We are seeking a highly motivated Data Linguist with expertise in Malay or European Portuguese to support the development and optimization of Large Language Models (LLMs). The ideal candidate will have a strong linguistic background combined with technical skills in data processing, scripting, and machine learning concepts.
This role requires working with large-scale language datasets, creating linguistic rules, and collaborating with cross-functional teams to improve AI model performance.

Key Responsibilities
  • Analyze, process, and validate large volumes of multilingual language data.
  • Develop and maintain hard-coded linguistic rules for LLMs using regular expressions (regex).
  • Utilize Python and related data processing tools for data manipulation and automation.
  • Read, understand, modify, and adapt existing scripts and notebooks developed by peers.
  • Support data preparation, annotation, and quality assurance for AI/ML projects.
  • Apply knowledge of dialectal and regional language variations to improve linguistic accuracy.
  • Collaborate with engineers, data scientists, and linguists to deliver high-quality language solutions.
  • Adapt quickly to evolving project requirements and workflows while maintaining accuracy and efficiency.

Required Qualifications
  • Native or near-native proficiency in Malay or European Portuguese.
  • Strong understanding of linguistic structures, grammar, syntax, and regional dialects.
  • Experience with Python for data processing and visualization.
  • Hands-on experience using Regular Expressions (Regex) to create and maintain linguistic rules.
  • Understanding of the relationship between language data and machine learning models.
  • Ability to work with Jupyter notebooks and command-line environments for data manipulation.
  • Excellent analytical, communication, and collaboration skills.
  • Ability to manage large datasets with high attention to detail.
  • Fast learner with the flexibility to adapt to changing priorities and new tasks.

Preferred Qualifications
  • Experience working on NLP, Generative AI, or Large Language Model (LLM) projects.
  • Familiarity with data annotation, text processing, or corpus analysis.
  • Experience in AI/ML data operations or computational linguistics.
  • Knowledge of additional programming or scripting languages is a plus.

Required Skills
  • Native-level proficiency in Malay or European Portuguese
  • Python
  • Regular Expressions (Regex)
  • Data Processing & Visualization
  • Machine Learning Fundamentals
  • Command Line & Jupyter Notebooks
  • Linguistic Analysis
  • NLP/LLM Concepts
  • Team Collaboration
  • Problem Solving & Adaptability
