Wrote a performant Storm topology to stream high-volume data (roughly 7–10k rows per second) from Oracle to HDFS in real time (portions are now open source at http://github.com/linkshare ).
Wrote Flume sinks to geocode and stream click data from Apache logs to Hive tables.
Wrote unit and functional tests for Storm topology components, including an Oracle mock backed by in-memory H2, as well as Cucumber-JVM feature files for integration and functional testing.
Certified at Big Data TechCon 2014.
------ DevOps Engineer ------
Wrote Chef cookbooks and Python/shell scripts to fully automate the provisioning of a complex, interdependent 100-terabyte CDH cluster.
Built custom CSDs and Cloudera Parcels for Redis, Storm, and the Pentaho BI Server (now open source at http://github.com/linkshare ).
Automated deployment of Oozie workflows and HDFS scripts using Jenkins and Git hooks.
Automated SOPs for managing and operating Storm streams using Jenkins.
Gained exposure to the Lambda Architecture, with a focus on the real-time layer and streaming.
Integrated Storm with Logstash.
Wrote numerous Chef cookbooks to provision various application servers, both on AWS and in data centers, leveraging advanced features such as encrypted data bags, the Chef Search API, LWRPs, and Berkshelf.
------ Back-end/Full Stack Developer ------
Built a strictly REST-compliant API using PHP/Symfony.
Strong exposure to MongoDB, including concepts such as sharding and replica sets.
Built a Node.js real-time data streaming application capable of streaming up to 5k records per second.
Integrated CAS Server with Shibboleth IdP v3 to implement SAML2 SSO.