Centraprise
AI Data Linguist (Malay / European Portuguese)
Contract · In Office · Seattle, Washington (USA)
$60,000–$75,000 · Posted Jun 11, 2026
Job Description – Data Linguist (Malay / European Portuguese)
- Job Title: Data Linguist
- Job Location: Seattle, WA or Burlingame, CA
- Employment Type: Full-Time
Position Overview
We are seeking a highly motivated Data Linguist with expertise in Malay or European Portuguese to support the development and optimization of Large Language Models (LLMs). The ideal candidate will have a strong linguistic background combined with technical skills in data processing, scripting, and machine learning concepts.
This role requires working with large-scale language datasets, creating linguistic rules, and collaborating with cross-functional teams to improve AI model performance.
Key Responsibilities
- Analyze, process, and validate large volumes of multilingual language data.
- Develop and maintain hard-coded linguistic rules for Large Language Models (LLMs) using regular expressions (Regex).
- Use Python and related data-processing tools for data manipulation and automation.
- Read, understand, modify, and adapt existing scripts and notebooks developed by peers.
- Support data preparation, annotation, and quality assurance for AI/ML projects.
- Apply knowledge of dialectal and regional language variations to improve linguistic accuracy.
- Collaborate with engineers, data scientists, and linguists to deliver high-quality language solutions.
- Adapt quickly to evolving project requirements and workflows while maintaining accuracy and efficiency.
Required Qualifications
- Native or near-native proficiency in Malay or European Portuguese.
- Strong understanding of linguistic structures, grammar, syntax, and regional dialects.
- Experience with Python for data processing and visualization.
- Hands-on experience using Regular Expressions (Regex) to create and maintain linguistic rules.
- Understanding of the relationship between language data and machine learning models.
- Ability to work with Jupyter notebooks and command-line environments for data manipulation.
- Excellent analytical, communication, and collaboration skills.
- Ability to manage large datasets with high attention to detail.
- Fast learner with the flexibility to adapt to changing priorities and new tasks.
Preferred Qualifications
- Experience working on NLP, Generative AI, or Large Language Model (LLM) projects.
- Familiarity with data annotation, text processing, or corpus analysis.
- Experience in AI/ML data operations or computational linguistics.
- Knowledge of additional programming or scripting languages is a plus.
Required Skills
- Native-level proficiency in Malay or European Portuguese
- Python
- Regular Expressions (Regex)
- Data Processing & Visualization
- Machine Learning Fundamentals
- Command Line & Jupyter Notebooks
- Linguistic Analysis
- NLP/LLM Concepts
- Team Collaboration
- Problem Solving & Adaptability