Position Details: Solution Architect - Agentic AI Data Engineering (Artificial Intelligence)

Location: Atlanta, GA
Openings: 1

Description:

Role: Agentic AI Data Engineer
Location options: San Francisco Bay Area, New York / New Jersey, Atlanta, Chicago, and Dallas. 

Preface
The Agentic AI Data Engineer is a hands-on role focused on building and maintaining the data pipelines and infrastructure that fuel AI agent systems. Within TCS’s AI & Data group (Americas), you will be the builder who turns data architecture plans into reality, ensuring that AI models and agents have continuous access to high-quality, timely data. This client-facing consulting role involves hybrid work from TCS hubs with travel to client sites as needed for deployment. You’ll work across industries, handling a wide array of data: financial transactions for BFSI AI solutions, sensor and machine data for Manufacturing AI, patient or research data for Life Sciences AI, and more. By combining expertise in data ingestion, transformation, and integration with knowledge of AI data needs, you will play a critical part in enabling AI agents to perform reliably and accurately in production.

What You Would Be Doing:
• Build Data Ingestion Pipelines: Develop robust pipelines to extract data from various sources (databases, APIs, flat files, streaming sources) relevant to the AI solution.
• Data Transformation & Processing: Implement transformation and cleaning steps on raw data to make it usable for AI, ensuring efficiency and scalability.
• Loading Data to Storage/Indices: Set up processes to load processed data into target storage systems that AI agents or models will use.
• Real-Time Data Feeds: Implement streaming or incremental update pipelines when AI systems require real-time or frequently updated data.
• Pipeline Automation & Scheduling: Use orchestrators or schedulers to automate the data workflows.
• Data Integration & API Development: Develop and maintain integration components for real-time data fetching.
• Collaborate on RAG/Knowledge Base Updates: Work closely with AI Data Architects to implement updates to retrieval-augmented generation (RAG) knowledge bases.
• Testing and Validation of Data Pipelines: Develop tests and monitoring for your data pipelines.
• Optimize Pipeline Performance: Profile and optimize data pipelines for speed and resource usage.
• Documentation and Handover: Document pipeline processes, configurations, and dependencies clearly.
• Industry-Specific Data Handling: Adapt data engineering to specific domain needs.
• Collaboration & Agile Implementation: Work as part of an agile product team, collaborating with data architects, AI engineers, and others.
• Maintain and Evolve Pipelines: Monitor pipelines and handle maintenance post go-live.

What Skills Are Expected:
• Programming & Scripting: Strong programming skills, especially in Python, and experience with other languages like SQL.
• Data Pipeline Development: Practical experience building data pipelines end-to-end.
• Database and SQL Skills: Proficiency in writing and optimizing SQL queries.
• Big Data & Distributed Processing: Experience with big data technologies like Apache Spark.
• Streaming Data Experience: Familiarity with streaming frameworks and tools like Kafka.
• API Integration and Web Services: Ability to interact with web APIs for data ingestion or extraction.
• Data Formats and Parsing: Strong understanding of data formats and ability to parse JSON, XML, or custom text formats.
• DevOps for Data Pipelines: Basic DevOps skills, including using Git for version control and CI/CD pipelines.
• Problem Solving & Debugging: Strong ability to troubleshoot data issues.
• Data Quality Focus: Attentiveness to data quality and skills in implementing checks and validating outputs.
• Collaboration & Communication: Good communication skills to work with the team and clients.
• Time Management & Flexibility: Ability to handle multiple tasks and prioritize effectively.
• Domain Data Understanding: Aptitude for learning the domain context behind the data you work with.
• Security & Privacy: Understanding of how to handle sensitive data securely in pipelines.
• Continuous Learning: Willingness to learn new tools or frameworks as needed.

Key Technology Capabilities:
• ETL / Data Integration Tools: Experience with tools such as Apache Airflow, Informatica PowerCenter, or cloud-based services like Azure Data Factory.
• Big Data Processing: Proficiency in Apache Spark and knowledge of Hadoop HDFS.
• SQL & Databases: Strong practical SQL skills and familiarity with relational database systems.
• NoSQL and Other Data Stores: Knowledge of NoSQL systems like MongoDB or Cassandra.
• Stream Processing: Hands-on usage of Apache Kafka and understanding of consumer group mechanics.
• Cloud Storage & Compute: Familiarity with cloud storage services like Amazon S3 and cloud compute for ETL.
• APIs & Web Services: Experience building or using connectors to RESTful APIs.
• File Formats & Data Serialization: Understanding of various file formats and ability to convert between them.
• Operating Systems & Scripting: Comfortable with Linux shell and basic shell scripting.
• Version Control & CI/CD: Using Git for source control and setting up CI pipelines for data projects.
• Monitoring & Logging Tools: Utilizing monitoring tools for data workflows.
• Data Visualization/Verification: Basics of tools like Excel or Python’s Jupyter notebooks for data sanity checks.
• Security & Networking: Understanding network configurations for data transfer.
• Testing Frameworks: Familiarity with PyTest or unittest for writing tests for data transformations.
• Collaboration Tools: Experience with collaboration tools like JIRA and with documentation tools.
• AI/ML Familiarity: Understanding of AI/ML fundamentals is a plus.

Salary Range: $127,500 - $172,500 a year

#LI-AD1
