Responsibility Summary:
• 7+ years of experience required
• Mandatory skill: Databricks Engineer
• Create and maintain optimal data pipeline architecture capable of ingesting structured and unstructured data in near real-time
• Assemble large, complex data sets that meet functional and technical requirements
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
• Develop data flows that can leverage both on-premises and cloud architectures
• Build KPIs that utilize the data pipeline to provide insights into user adoption, operational efficiency, and other key business performance metrics
• Work with stakeholders, including the business analytics teams and IS architecture teams, to assist with data-related technical issues and support their data infrastructure needs
• Work with data and analytics experts to strive for greater functionality in our data systems and assist in data transformations needed for BI/analytics use

Problem Solving:
• Strong analytical skills are necessary to research technical issues and recommend potential solutions for leadership

Decision Making:
• Medium. Must be able to apply judgment within the scope of defined processes
• Individuals are expected to resolve issues while meeting the needs of the business or associate
• Individuals are expected to perform daily tasks autonomously based on a framework of documented policies and procedures, and are responsible for the accuracy and timeliness of their work
• Critical decisions require coordination and leadership review prior to implementation
• Impacts of decisions may be widely visible (e.g., system outage/access issues) but short in duration
• Management/supervisory responsibility does not apply

Qualifications Required in the Job:
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL), as well as working familiarity with a variety of database structures (MongoDB, DataVault)
• Experience building and optimizing data pipelines, architectures, and datasets
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
• Strong analytic skills related to working with unstructured datasets and recommending transformations to optimal architectures
• Build processes supporting data transformation, data structures, metadata, dependency, and workload management
• A successful history of manipulating, processing, and extracting value from large, disconnected datasets
• Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
• Strong project management and organizational skills
• Experience supporting and working with cross-functional teams in a dynamic environment
• Candidates should also have 7+ years of experience using the following software/tools: Spark, Kafka, etc.
• Experience with relational SQL and NoSQL databases
• Experience with AWS/Azure cloud services
• Experience with stream-processing systems: Kafka, Spark-Streaming, etc.
• Experience with object-oriented/object function scripting languages: Python, Java, etc.

Job Type
Full-time role
Skills required
NoSQL, Python, Java
Location
Dallas, Texas
Salary
No salary information was found.
Date Posted
April 27, 2025
Kaygen, Inc. is seeking a Databricks Engineer with over 7 years of experience to design and maintain data pipeline architectures. The role involves optimizing data delivery and collaborating with cross-functional teams in Dallas, Texas.