Data Engineer
PulsePoint
PulsePoint is a digital media company headquartered in NYC. We're looking to grow the engineering team supporting our Exchange by adding engineers who are smart, get things done, communicate well, understand computer science, take pride in engineering, and want to work in an interesting and rapidly evolving industry. Our experienced team wants to add developers who seek technical challenges and the empowerment to make decisions that really move the business forward.
Some major projects the Exchange team is currently working on: continuing to scale our core exchange platform, honing the intelligence of our optimization, cutting feedback time for business intelligence, improving fraud protection, and automating aggressively. Currently, the PulsePoint Exchange:
- handles hundreds of thousands of transactions per second, totaling billions each month
- conducts real-time ad-serving based on statistical and machine-learning models
- returns responses collected from dozens of parties in milliseconds
- constantly adapts to a market that changes in days and weeks, not months and years
- incorporates thousands of data points into every serving decision
If working in an agile shop with a team of top-flight engineers growing the system described above sounds interesting, here are some things you should know:
Summary
As a data engineer at PulsePoint, you'll support data infrastructure services, like Hadoop, RDBMS, and NoSQL platforms, from installation and configuration to maintenance and curation. You'll be a technology evangelist, providing tools and support that help people throughout the organization use data to make informed decisions, draw accurate conclusions, and otherwise do amazing things. You'll work closely with analysts, data scientists, and devs to make data processing transparent and provide data services that help drive and support business goals.
Responsibilities
- Change, release, and configuration management of data infrastructure services
- Measure and maintain uptime and health of data infrastructure services
- Monitor and provide transparency into data quality across systems (accuracy, consistency, completeness, etc.)
- Increase accessibility and effectiveness of data (work with analysts, data scientists, and devs to build/deploy tools and datasets that fit their use cases)
- Design and implement retention, backup, and recovery policies and procedures
Required Skills
- Proficiency in at least one scripting language (e.g., Python, Perl, shell, Ruby) - you can quickly put together a script to automate a task or parse text
- Familiarity with core Linux utilities - you're not afraid of the command line and can string together some one-liners to find your answer
- Experience with at least one version control system (e.g., svn, git)
- Intermediate Linux sysadmin skills - when a daemon begins failing health monitors without going down, you can troubleshoot the operating system to see what's going on
Appreciated Skills
- SQL - you'll need to transform data and support SQL-speaking analysts
- Demonstrable OOP experience - you'll need to build and support libraries and tools for interacting with or administering data
- Experience/interest in visualizations - you recognize that visualizations are crucial for extracting meaning from various datasets and have a preferred tool for charting a time series
- Experience/interest in stats and analytics
More about working in Technology at PulsePoint:
We're small enough that you can own something and have a direct impact, but big enough that you don't have to go it alone. We care deeply about quality and doing the right thing, but have a strong focus on business value and time to market, and we believe that focusing on the first part enables the second. Tech staff report to managers who are technical and write code, and have direct access to business, product, and operations (and they have access to us). Lastly, our Sr. Engineers have lots of empowerment and freedom of action (but we don't water down our responsibilities or expectations).
- Tools we use: SQL Server, MySQL, Vertica, Hadoop/Pig/Hive, SAS, QlikView
- Practices we've adopted: TDD/unit testing, continuous integration, code reviews, Scrum
- Current projects: cloud computing, event-driven IO, self-healing systems, analytics
- We like open source: Spring, Hadoop (we run the NYC Hadoop Meetup), MySQL, GlassFish, Linux, Memcache
- We have rules around meetings to keep them short and useful
- Tech staff get fast boxes, with multiple monitors, and can choose Windows or Linux
- We keep a library of technical books (several hundred)
- We have the usual coffee, tea, sodas, snacks, socials and events
We also believe that good balance outside work produces good work, so we have:
- Sane work hours (with flexible scheduling)
- Real salaries, real benefits
- Decent paid vacation (which your manager will make sure you actually take)
New York City
Full Time