Location: Houston, TX

Post Date: May 5

Employment Type: IT Contract To Hire

Reference Code: #6119

Job Description



  • Research and analyze emerging technologies, platforms, tools and solutions; map new and existing business opportunities to potential big data, intelligence and analytics solutions
  • Capture business/functional requirements, expected service levels and user experience requirements from business units
  • Provide recommendations, technical direction and leadership for the selection and incorporation of cloud based big data warehouse and data lake solutions and business applications
  • Develop an effective phased roadmap and practical architectures for deploying modern enterprise data warehouse and data lake platforms for business intelligence, advanced analytics, machine learning and data science applications
  • Plan, architect and build next-generation enterprise data lake and analytics applications using Hadoop-ecosystem platforms such as Azure HDInsight, Cloudera, Hortonworks and MapR
  • Develop highly scalable, extensible and reliable big data solutions that enable collection, storage, modeling and analysis of large structured and unstructured datasets from multi-channel sources
  • Develop and maintain processes to acquire, analyze, store, cleanse and transform large datasets using tools such as Spark, MapReduce, Kafka, Sqoop, Hive, NiFi, HBase and YARN
  • Apply strong expertise on ETL architectures, data movement technologies, data cleansing techniques, optimization of datastores, and building communication channels between structured and unstructured databases to support solution operationalization
  • Develop and maintain enterprise data standards, quality guidelines, best practices, security policies and governance processes for the big data / Hadoop ecosystem
  • Apply a strong understanding of cloud and on-premises infrastructure architectures, environments, constraints and available options for big data warehouses and data lakes; develop the solution deployment roadmap and architectures with cloud/infrastructure teams
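To make the acquire/cleanse/transform responsibility above concrete, here is a minimal sketch of that cycle in plain Python. The record layout and field names (`order_id`, `amount`, `region`) are hypothetical; an actual pipeline in this role would run these steps at scale on Spark, Kafka or Hive rather than in-memory Python.

```python
# Toy illustration of the acquire -> cleanse -> transform cycle.
# Field names and defaults are invented for the example.

def cleanse(records):
    """Drop records missing an order_id and normalize field formats."""
    cleaned = []
    for r in records:
        if not r.get("order_id"):
            continue  # discard rows with no usable key
        cleaned.append({
            "order_id": r["order_id"].strip(),
            "amount": float(r.get("amount") or 0.0),
            "region": (r.get("region") or "UNKNOWN").upper(),
        })
    return cleaned

def transform(records):
    """Aggregate cleansed records into per-region totals."""
    totals = {}
    for r in records:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals

raw = [
    {"order_id": " A1 ", "amount": "19.99", "region": "tx"},
    {"order_id": "", "amount": "5.00", "region": "tx"},  # dropped: no key
    {"order_id": "B2", "amount": None, "region": None},  # defaults applied
]
print(transform(cleanse(raw)))  # {'TX': 19.99, 'UNKNOWN': 0.0}
```

In Spark the same shape appears as a `filter` followed by column normalization and a `groupBy`/aggregation, distributed across the cluster.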

Qualifications

  • 10+ years of hands-on experience in architecture, design or development of enterprise data solutions, applications, and integrations
  • 4+ years of demonstrated experience in architecture, data modeling and implementation of large, scalable and highly complex big data warehouse and data lake projects using cloud-native and open-source platforms and technologies
  • Demonstrated expertise and hands-on experience with Hadoop-ecosystem platforms (e.g., Azure HDInsight, Cloudera, Hortonworks, MapR) and NoSQL datastores (HBase, Cassandra, Azure Cosmos DB)
  • Demonstrated expertise and hands-on experience with Azure- and AWS-based big data technologies such as Azure SQL Data Warehouse, Azure Data Lake, Azure ML Studio/Workbench, Databricks, Cortana Intelligence Suite, Redshift, RDS, Glacier and Kinesis
  • Advanced proficiency and hands-on scripting experience with Spark, Python and U-SQL; advanced proficiency in R, PySpark, SparkR, Scala and Hive
  • Hands-on expertise working with large, complex datasets, real-time/near-real-time analytics and distributed big data platforms; experience deploying data movement, storage, cleansing, transformation and data quality management for big data solutions
  • Excellent communication skills: writing, presentations and interpersonal skills
  • B.S. in Engineering or Computer Science is preferred; advanced certifications in big data, machine learning and advanced analytics platforms are desirable