Solid background in computer science with a PhD in Big Data
management on massively parallel systems.
Strong research experience with multiple publications in top-ranked
conferences (according to CORE Rankings Portal).
Advanced coding skills in Java (10+ years) and past experience in
various languages.
Open-source advocate, PMC member (former VP) of Apache Calcite,
PMC member of Apache Hive, ASF member.
Teaching experience (200+ hours) in higher education at various
levels.
Professional experience
Staff Software Engineer
Cloudera Paris,
France
Jan 2022 - Present
Senior Software Engineer
Cloudera Paris,
France
May 2020 - Jan 2022
R&D Engineer
TIBCO Paris,
France
January 2016 - April 2020
Designed and implemented the first version of the SQL query
processor of the system using Apache Calcite.
Coordinated groups of 3-4 engineers to improve the query processor
(new features and bug fixes).
Designed and implemented part of the new indexing framework of
the system using Apache Lucene, improving the scalability of the
system by a factor of 10.
Participated in the prototyping of a multi-versioned ORM
system based on JPA and Hibernate capable of handling millions
of records and several thousand versions.
Resolved critical bugs in production involving race conditions
and memory leaks.
Collaborated with the other teams (support, UI, CI) to address
customer requests and improve the system.
Presented various R&D subjects in universities and
conferences, promoting the company.
Participated in the recruitment process for selecting and
hiring new engineers.
Post-doctoral researcher
INRIA Orsay,
France
November 2015 - January 2016
Devised data integration techniques for applications relying
on multiple data management systems (Postgres, MongoDB,
AsterixDB, SparkSQL, Solr) to facilitate querying and improve
performance, publishing papers in SIGMOD (A*) and ICDE (A*).
Assisted in the implementation of Estocada (a polystore
system) by improving the query rewriting algorithm and
prototyping the connectors for Redis, Spark, and Postgres.
Co-authored a book
on cloud-based RDF data management, presenting and critiquing
the most prominent approaches in terms of storage, query
processing, and reasoning.
Devised partitioning and query processing algorithms for
distributed architectures, with an emphasis on RDF data management,
publishing 2 papers in ICDE (A*).
Designed and implemented CliqueSquare,
a distributed system relying on Hadoop MapReduce/HDFS capable of
answering queries 10x faster than the state-of-the-art; it
features a query/plan parser, algebraic operators, a novel
partitioner and query planner.
Devised indexes for cloud-based RDF data management systems,
examining performance/storage/monetary cost trade-offs and
publishing the results in a book chapter.
Directed the implementation of RDF indexes in AMADA, a
data management system for RDF/XML data relying on Amazon S3,
EC2, DynamoDB, and SQS services.
Designed a system for semi-automatic fact-checking, publishing
a demo paper in SIGMOD (A*) and initiating a collaboration with Le Monde.
Designed and implemented FactMinder,
a Chrome plugin for semi-automatic fact-checking relying on
JavaScript and jQuery.
Administered the installations of Hadoop, Spark, Redis,
Postgres, and Hive in a cluster of 10 nodes.
Organized the seminars of the LaHDAK team.
Reviewed numerous research papers for the ESWC (A), SIGMOD (A*),
and ICDE (A*) conferences.
Research assistant
INRIA Orsay,
France
September 2010 - July 2011
Devised a novel data model, XR,
for naturally representing documents (XML) with annotations
(RDF), publishing a paper in VLDB Journal (A*).
Implemented a system prototype for XR by extending ViP2P, a
P2P XML data management system implemented in Java, with RDF
support and a bind join algorithm.
Designed and implemented in Java a customizable generator for
RDF data, capable of generating millions of RDF triples in only
a few seconds, which was used for benchmarking the XR system.
Research assistant
FORTH-ICS Heraklion, Greece
July 2008 - September 2010
Devised techniques for the visualization and exploration of
RDF graphs, publishing a demo in ESWC (A)
and a journal article in JVLC (A).
Implemented some of the major features of StarLion, an
RDF visualization application written in Java (relying on
JGraph), such as smooth transitions, semi-automatic layout,
plugin support, force indicators, node colouring, save/load
layouts, undo/redo actions, and image exporting, which according
to user studies greatly improved the usability of the tool.
Improved the performance of StarLion (through caching, better
data structures, and smarter algorithms), reducing response time
by a factor of 100.
Designed, implemented, and curated the website of StarLion.