Data Engineer - Big Data

Farfetch (NYSE: FTCH)

  • Location: Porto
  • Date Posted: 16 Sep 2020
  • Function: Data Science
  • Sector: Retail

Farfetch exists for the love of fashion. We believe in empowering individuality. Our mission is to be the global technology platform for luxury fashion, connecting creators, curators and consumers.

Technology

We’re on a mission to build the technology that powers the global platform for luxury fashion. We operate a modular end-to-end technology platform purpose-built to connect the luxury fashion ecosystem worldwide, addressing complex challenges and enjoying it. We’re empowered to break traditions and disrupt, with the freedom and autonomy to make a real impact for our customers all over the world.

Porto

Our Porto office is located in Portugal's vibrant second city, known for its history and its creative yet cosy environment. We welcome new ideas, and a large number of our people are based here: from Account Management to Technology and Product, whatever your skills are, you'll find your fit here. You can have an informal meeting in the treehouse or play the piano in your lunch break!

The role

We are looking for someone to join our Business Intelligence & Analytics team. In this position you will be in charge of developing high-performance, distributed computing tasks using Big Data technologies such as Hadoop, NoSQL and other distributed-environment technologies, based on the needs of the organization. You will also be responsible for analyzing, designing, programming, debugging and modifying software enhancements and/or new products used in distributed, large-scale analytics solutions.

What you'll do

  • Design and develop highly scalable, end-to-end processes to consume, integrate and analyze large volumes of complex data from sources such as Hive, Flume, Kafka or Storm;
  • Provide Data Engineering expertise to multiple teams across our organization, guiding and supporting software engineers on industry and internal data best practices;
  • Build fault-tolerant, adaptive and highly accurate computational data pipelines; tune queries running over billions of rows in a distributed query engine;
  • Research and implement new data technologies as needed;
  • Work with other teams to understand their needs and provide solutions;
  • Find innovative solutions through a combination of creative thinking and deep understanding of the problem space;
  • Work with the Business Intelligence development team to migrate and improve existing SQL Server-based ETLs onto MapReduce and Hive (cloud) technology to achieve scale and performance;
  • Help define and implement new processes on the data warehouse platform, and work closely with Data Scientists to transform big data into model-ready forms to support analytic projects (a minimal sketch of this kind of pipeline follows this list).
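As an illustration of the kind of work described above, here is a minimal PySpark sketch (assuming Spark with Hive support) of one distributed ETL step: it reads a large, partitioned Hive table, aggregates it in a distributed query engine, and writes a model-ready summary back to the warehouse. The table and column names (analytics.orders, gmv, order_id, country) are hypothetical placeholders rather than part of this role description, and Spark is only one of the engines (MapReduce, Hive, Spark) mentioned in this posting.

```python
# Minimal PySpark sketch of a Hive-backed ETL step; all names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Spark session that can read and write tables registered in the Hive metastore.
spark = (
    SparkSession.builder
    .appName("orders-daily-summary")
    .enableHiveSupport()
    .getOrCreate()
)

# Read one partition of a large, partitioned Hive fact table
# (billions of rows across partitions in a production warehouse).
orders = spark.table("analytics.orders").where(F.col("order_date") == "2020-09-16")

# Aggregate gross merchandise value and distinct order counts per country for that day.
summary = (
    orders.groupBy("country")
          .agg(F.sum("gmv").alias("total_gmv"),
               F.countDistinct("order_id").alias("orders"))
)

# Persist the model-ready output back to the warehouse for downstream analytics.
summary.write.mode("overwrite").saveAsTable("analytics.order_summary_daily")
```

The same shape of job applies whether the engine is Spark, Hive on MapReduce, or another distributed query engine; the point of migrating SQL Server-based ETLs onto these technologies is that the filter and aggregation run on the cluster rather than pulling rows onto a single machine.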

Who you are

  • Experienced in working with large data sets (both structured and unstructured) using technologies such as MapReduce, Hadoop, HBase, Hive, Spark and NoSQL databases;
  • From a strong programming background, with languages such as Java, C++ or Python;
  • Knowledgeable about distributed systems;
  • A professional with a background in working in cloud environments (AWS, Rackspace, Azure, etc.);
  • Experienced with real-time analysis of sensor and other data from the Internet of Things (IoT) or other connected devices (a plus);
  • Excellent at grasping algorithmic concepts in computer science (e.g. sorting, data structures);
  • Experienced in the design, development and release of enterprise-scale applications;
  • Experienced with version control;
  • A team worker with analytical and creative problem-solving abilities.
