Facebook contacted me on LinkedIn recently looking to fill a 'Production Engineer' role. I wasn't looking for job offers; they reached out to me. Apparently they had been doing this a fair bit, targeting DevOps professionals - my buddy at the startup I was recently working with also got contacted.
Having Facebook reach out to you for a job opportunity? Definitely intriguing. One of the things on my 'IT Career Bucket List' is to work at one of 'those' companies - Google, Facebook, etc. I thought if nothing else, it would be interesting to see how far the hiring process went, since in the past they didn't actively recruit unknowns like me.
I replied that I'd be interested to know more, and so a phone interview was arranged with the Facebook technical recruiter. At the appointed time (a couple of days later) he called, and I got the skinny on how things would potentially work.
The phone interview was the first step. After that, I would provide my resume for some of their technical leads to review. If they felt I was a fit based on my resume, there would be two remote technical screens - one focused specifically on the Linux OS, and the other on a coding language of my choice. They wanted someone who was very comfortable with both. Over the phone they gave me some sample questions - basic Linux commands - to give me an idea of what the screens would be like. If I managed to gain their approval in the technical interviews, Facebook would then fly me to the location of my choice (Menlo Park or Seattle) for face-to-face talks - both technical and otherwise. Following that, if I was still up to snuff, I'd get an offer. Once hired, there would be a six-week on-site 'boot camp' where I'd get trained in all things Facebook and brought up to speed on the technical ins and outs of the team I'd be working with.
Getting hired would require me to move to Seattle or Menlo Park. Real estate at both locations is exorbitant - like ridiculous. In the event that I was hired, Facebook would offer me a full relocation package, with the potential for temporary housing for 3-4 months while we looked for a permanent residence. I was told that many employees commute from communities with more reasonable housing prices using Facebook commuter buses that have complimentary drinks and Wi-Fi. Health and dental would be covered 100% for myself and my dependents. Wednesdays are an optional work-from-home day, and Facebook offers 21 days of vacation per year - although I neglected to confirm whether that was 21 business days (over 4 weeks) or 3 weeks all in. I'm a Canadian, so I asked about a work visa. He replied that Facebook has an immigration/legal department that would arrange a TN1 visa for me, in the event that I was hired.
As far as the job itself - Facebook's version of the Production Engineer role is essentially described here. At the time I was talking with them, Facebook had 42 teams, each responsible for a particular feature in their system (FB Messenger, Ads, News Feed, etc.). These teams consist of 4-5 developers with an embedded production engineer. Everyone on the team goes on call, one week at a time, so it ends up being a 4-5 week on-call rotation.
In the end, after viewing my resume they decided not to pursue the hiring process with me further at this time. They had been interviewing 'a lot of strong candidates recently that they felt were a stronger match for their immediate needs.' I can't say I was heartbroken. It would have been a big move for us that would have put pressure on me personally and financially - not to mention having one dependent in university and one in high school.
I have been pondering what it was about my resume that gave the hiring managers pause. Was it the fact that I've moved jobs every 2-3 years, and they had concerns that I wouldn't last long at FB? Perhaps it was my lack of focus in technology - moving contracts every 2-3 years means learning lots of new technology and never getting a chance to really specialize. I've asked the FB recruiter - we'll see what he comes back with. (You might be wondering why I've moved jobs every 2-3 years... It's prudent for independent business contractors like myself to 'keep moving' from a Canadian taxation perspective.)
Friday, July 6, 2018
Saturday, September 9, 2017
It's been a while since I've tried to get any kind of industry certification. Life was busy. I was consulting full-time and had clients to support after work as well. Why would I even bother with all the study? Does a certification amount to much anymore? Lately, several things converged to motivate me to write this cert exam...
- My resume was getting stale. While I was able to keep a steady stream of contracts going, many of them used older technologies rather than anything current. This concerned me.
- I got a new gig at a start-up that gave me a hands-on opportunity to work with newer technology in the cloud. After setting up the infrastructure and CI/CD for several QA environments and geo-load-balanced UAT and PRD environments in the Google Cloud Platform, a requirement for hosting our data in Canada became paramount. AWS had recently launched data centres in Canada, so in May we decided to migrate our infrastructure there. Having successfully completed that migration (there's a small post-migration sanity check sketched after this list), I wondered how easy it would be to follow up with a Solutions Architect certification.
- A couple of guys at work got me turned on to udemy.com. Catching some sales, I was able to get a couple of courses on AWS certification for $15 each (regularly they are over $150 each). One course was by A Cloud Guru for the AWS SysOps certification. The other was by Linux Academy for the AWS Solutions Architect certification. These two courses gave me a good foundation for the material that is covered on the exam.
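For what it's worth, here's the kind of post-migration sanity check I like to script - a minimal sketch assuming boto3 is installed and AWS credentials are configured; the regions listed are just illustrative:

```python
# Post-migration sanity check: confirm running EC2 instances live in the
# Canadian region, and flag anything left behind in the old regions.
# Assumes boto3 is installed and credentials are configured; region names
# here are illustrative.
import boto3

def running_instances(region):
    ec2 = boto3.client("ec2", region_name=region)
    instances = []
    for page in ec2.get_paginator("describe_instances").paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    ):
        for reservation in page["Reservations"]:
            instances.extend(reservation["Instances"])
    return instances

if __name__ == "__main__":
    print(f"ca-central-1: {len(running_instances('ca-central-1'))} running instances")
    for old_region in ("us-east-1", "us-west-2"):  # regions we migrated away from
        leftovers = running_instances(old_region)
        if leftovers:
            print(f"WARNING: {len(leftovers)} instances still running in {old_region}")
```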
I did the SysOps course first, as I thought that course was more in line with my job description at work. Finishing that course (leaving its practice test for later), I took a look at the practice questions on AWS here and felt like I'd come up short if I wrote the exam. That's when I decided to switch gears and do the Solutions Architect course and exam. Most people online recommend doing that one first.
Incidentally, both courses helped me get a better grasp of AWS best practices, and I was able to implement several improvements to our infrastructure at work because of what I learned. After another week and a half of going through the Solutions Architect course and reviewing the material, I felt more confident. I took the practice tests from both courses and passed them with good marks, so I thought I was ready. I scheduled my exam for the next week and also, for good measure, purchased a 20-question 'official practice exam'.
I had the week off during which I was scheduled to write my exam. Thursday was the big day. Tuesday morning I wrote the 'official practice exam', which was full of scenario-based questions - and got 60% - A FAIL! Apparently the passing mark floats a bit between 62 and 66% (go figure). The practice exams in the courses seemed a bit easier. One of the things I hadn't realized was how the different 'domains' for the exam were weighted:
In my 'official practice exam' I had scored:
1.0 Designing highly available, cost-efficient, fault-tolerant, scalable systems: 50%
2.0 Implementation/Deployment: 100%
3.0 Data Security: 75%
4.0 Troubleshooting: 50%
Clearly my 100% in implementation/deployment wasn't going to help me much with the domains balanced like that. I had some work to do!
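To see how much the weighting matters, here's a back-of-the-napkin calculation. The weights below are placeholders for illustration only - the real ones are in the official exam blueprint - but plugging in my practice scores shows how a weak result in a heavily weighted domain drags everything down:

```python
# Back-of-the-napkin weighted score. The domain weights are placeholders
# for illustration only; see the official exam blueprint for the real ones.
weights = {
    "Designing HA, cost-efficient, fault-tolerant, scalable systems": 0.60,  # assumed
    "Implementation/Deployment": 0.10,                                       # assumed
    "Data Security": 0.20,                                                   # assumed
    "Troubleshooting": 0.10,                                                 # assumed
}

practice_scores = {
    "Designing HA, cost-efficient, fault-tolerant, scalable systems": 0.50,
    "Implementation/Deployment": 1.00,
    "Data Security": 0.75,
    "Troubleshooting": 0.50,
}

overall = sum(weights[d] * practice_scores[d] for d in weights)
print(f"Weighted overall: {overall:.0%}")  # 60% with these assumed weights
```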
A Cloud Guru has a forum specifically for the exam. I pored over it, looking specifically at posts where people who had written the exam shared their experiences and what they wished they had studied. I made notes of what I didn't know from those posts, and I also went over a lot of official AWS documentation for specific services (mostly FAQs and Best Practices) and took notes from that, too. And then I studied hard. Six pages of 8-point text - and I kept adding things to those notes as well.
Thursday rolled around and I wrote the exam. I tried my best. The exam is 55 multiple-choice questions and you have 80 minutes to complete it. Some of the questions I knew the answer to, hands down. Others (like 'choose two correct answers') got a bit dicey. I chose the answers that made the most sense to me. I didn't feel blind-sided by any of the questions, although there were definitely some things I (still) could have studied more (AD/user federation types, for example). I finished answering the questions with about 15 minutes to spare. I had flagged some questions, so I went back, reviewed them, and changed a couple of answers. When all was said and done, I passed with 72% - not great, but good enough for a certification.
Here's how things panned out:
1.0 Designing highly available, cost-efficient, fault-tolerant, scalable systems: 75%
2.0 Implementation/Deployment: 80%
3.0 Data Security: 55%
4.0 Troubleshooting: 80%
I was quite happy with the improvements I'd made in the first and last domains.
Apparently there is quite a bit of content cross-over between the AWS Solutions Architect - Associate exam and the other two associate certifications: SysOps and Developer. Potentially I could study a bit and write those fairly soon. However, it's $150 USD per exam, and I'm wondering what the difference is between having all three certs versus just the one. I'd be interested to hear your thoughts. Would it be worth the extra $300 USD? Personally, I'm content to take a break from studying for now.
Saturday, August 20, 2016
Hiccups I ran into:
- I can't dynamically add CPU or memory. I've got to stop the instance and restart it (there's a rough sketch of scripting that stop/resize/start dance right after this list). Same with disk space. I'd also like to be able to make some of my disks smaller, but I can't seem to do that either without some kind of reboot.
- If an instance has a secondary SSD drive, it seems that there are issues with discovering it on a 'clone' or a restart after a memory or CPU change. I have to log in through the back end (fortunately GCP provides an interface for this) and comment out the reference to the secondary drive in the /etc/fstab file to get it going again. This seems buggy.
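Since I keep running into the CPU/memory hiccup, here's roughly what the stop/resize/start workaround looks like scripted with the Google API client for Python. This is a sketch under assumptions - the google-api-python-client library, application default credentials, and made-up project/zone/instance/machine-type names:

```python
# GCE won't change CPU/memory on a running instance, so the workaround is
# stop -> setMachineType -> start. Assumes google-api-python-client and
# application default credentials; names below are illustrative.
import time
import googleapiclient.discovery

PROJECT, ZONE, INSTANCE = "my-project", "us-central1-a", "qa-app-01"
NEW_TYPE = f"zones/{ZONE}/machineTypes/n1-standard-4"

compute = googleapiclient.discovery.build("compute", "v1")

def wait_for(operation):
    # Poll the zone operation until it reports DONE.
    while True:
        result = compute.zoneOperations().get(
            project=PROJECT, zone=ZONE, operation=operation["name"]).execute()
        if result["status"] == "DONE":
            return result
        time.sleep(5)

wait_for(compute.instances().stop(
    project=PROJECT, zone=ZONE, instance=INSTANCE).execute())
wait_for(compute.instances().setMachineType(
    project=PROJECT, zone=ZONE, instance=INSTANCE,
    body={"machineType": NEW_TYPE}).execute())
wait_for(compute.instances().start(
    project=PROJECT, zone=ZONE, instance=INSTANCE).execute())
print(f"{INSTANCE} resized to {NEW_TYPE}")
```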
I recently spent a couple of days configuring Google Stackdriver Monitoring, and here's how it went.
Installing the agents is definitely the way to get going. I found it easiest to create Uptime Monitors and their associated Notifications at the instance level, on a server-by-server basis. Doing it this way allowed me to group the Uptime Monitors with the instance, something I couldn't do when I created the Uptime Monitors outside of the instance. It was simple to monitor external dependency servers that we don't own but need running for our testing, and I integrated my Notifications with HipChat in a snap. I also installed plugins for monitoring Postgres and MySQL - these worked great as long as I had the users/roles/permissions set correctly. I'm super impressed with Stackdriver Monitoring, and will probably keep using it even if we end up hosting with AWS.
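I set all of this up through the console, but it can be scripted too. Here's a rough sketch using the google-cloud-monitoring client library for Python - the project ID, host, and check settings are made up, and the library has changed a fair bit since I did this, so treat it as an outline rather than gospel:

```python
# Create an HTTP uptime check programmatically. Assumes the
# google-cloud-monitoring library and application default credentials;
# project/host/path values are illustrative.
from google.cloud import monitoring_v3

client = monitoring_v3.UptimeCheckServiceClient()

config = monitoring_v3.UptimeCheckConfig()
config.display_name = "uat-frontend-uptime"
config.monitored_resource = {
    "type": "uptime_url",
    "labels": {"host": "uat.example.com"},
}
config.http_check = {"path": "/healthz", "port": 443, "use_ssl": True}
config.period = {"seconds": 300}   # check every 5 minutes
config.timeout = {"seconds": 10}

created = client.create_uptime_check_config(
    request={"parent": "projects/my-project", "uptime_check_config": config}
)
print(f"Created uptime check: {created.name}")
```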
Our biggest roadblock with the Google Cloud Platform currently is that they don't have a data centre in Canada. Data residency could be a big selling feature for many of my clients, and because Google doesn't offer it, we may have to consider other options (like AWS/EC2, which is spinning up a new data centre in Montreal towards the end of 2016). Privacy matters! Hope you're listening, Google...
Thursday, April 7, 2016
[Image: The Amazon Web Services Management Console]
About six weeks ago I purchased Amazon Web Services in Action, written by Andreas and Michael Wittig and published by Manning. It gave me a great primer on setting up my AWS account, my billing alert, and my first couple of VMs. I leveraged that experience, crossed my fingers, spun up 18 VMs for my students, and hoped I wouldn't get charged a mint for having them running 24/7 for a few days. It was a Friday around 1pm when I created them and gave them to my students to use. Imagine my surprise when I checked my billing page in AWS on Monday and discovered I had only been charged $2.81!
On the last day of class, I split the students into two teams, gave them large web projects to do, and spun up a 'T1 Micro' for each team. I gave them some scripts they could run to create some swap memory if they needed it. They ran those scripts right away, and within an hour (with 9 people uploading files and content into those systems continuously) those T1 Micros' CPUs were pinned. So I quickly imaged them over lunch, spun up an 'M3 Large' VM (2 CPUs, 7.5 GiB of memory) for each team, and threw their images on. I ran into one issue spinning up the new team VMs - I had to spin down the original team VMs before I could start the new ones, because there is a limit/quota of 20 VMs on the 'free tier' in AWS. Aside from that and the changed IP addresses, the transition was seamless. I was a happy camper, and now that the students had responsive VMs, so were they.
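I did all of that by hand in the console, but the same 'image it and relaunch it bigger' move is easy to script. A rough sketch with boto3 - the instance ID, AMI name, and region are placeholders:

```python
# Image an overloaded instance and relaunch it as a bigger type.
# Assumes boto3 and configured credentials; IDs, names, and the region
# are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
SMALL_INSTANCE_ID = "i-0123456789abcdef0"  # the pinned T1 Micro

# 1. Create an AMI from the struggling instance and wait for it.
image = ec2.create_image(
    InstanceId=SMALL_INSTANCE_ID,
    Name="team1-webproject-snapshot",
    NoReboot=False,  # rebooting gives a consistent filesystem in the image
)
ec2.get_waiter("image_available").wait(ImageIds=[image["ImageId"]])

# 2. Stop the original so it doesn't count against the instance quota.
ec2.stop_instances(InstanceIds=[SMALL_INSTANCE_ID])

# 3. Launch a bigger instance from the image.
bigger = ec2.run_instances(
    ImageId=image["ImageId"],
    InstanceType="m3.large",
    MinCount=1,
    MaxCount=1,
)
print("New instance:", bigger["Instances"][0]["InstanceId"])
```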
My total bill for the week: $38! A colleague pointed out that I probably could have added auto-scaling to those team VMs and increased the CPU and memory in place, without losing the current IPs. He's right, I probably could have, but I didn't have the experience and didn't want to waste class time (and potentially lose the students' work) by trying something I didn't know how to do. All in all, I was very impressed with my first run of EC2 in AWS. It was very reasonable, responsive, and easy to use. I'd definitely do it again.
Monday, March 28, 2016
Open-source software is the backbone of most of the internet. Seriously. For more than a decade, small-business websites and mainstream web applications alike have been built on open-source software. Web projects (and frankly, most software projects in general) that don't have a dependency of some kind on an open-source component are the RARE exception to the rule.
Should we be concerned about this? I think so. Here's why:
- Coders aren't implementing open-source code properly. In my experience, open-source code dependencies should be referenced locally. Many coders fail to do this and reference external code libraries directly in their code. What happens when that external 'point of origin' has a DNS issue, is hit with a DDoS attack, or is simply taken down? Your site can break.
Case in point - check out this article on msn.ca, 'How One Programmer Broke the Internet'. In a nutshell, an open-source programmer got frustrated with a company over a trademark naming issue. This developer's open-source project had the same name as a messaging app from a company in Canada. He ended up retaliating by removing his project from the web. It turned out his project had been leveraged by millions of coders the world over for their websites, and once his code was removed, their builds and websites started throwing errors.
Another case in point - I worked on a project a number of years ago that had dependencies on jakarta.org, an open-source site at the time that hosted a bunch of libraries and schemas for large open-source projects like Struts and Spring. Some of the code in those projects had linked references to schemas (rules) that pointed to http://jakarta.org. Unfortunately, we didn't think of changing those links, and one day jakarta.org went down... and with it went our website, because it couldn't resolve those simple schemas from jakarta.org. After everything recovered, we quickly downloaded those schemas (and any other external references we found) and referenced them locally.
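These days I'll even run a quick scan over a codebase to see what it pulls in from outside. Here's a rough sketch (Python standard library only; the source path, file types, and the allow-list of domains we control are all made up):

```python
# Crude scan for externally hosted scripts, stylesheets, and schemas.
# Anything referenced over http(s) from a domain we don't control is a
# candidate for downloading and serving locally. Paths, suffixes, and the
# allow-list are illustrative.
import re
from pathlib import Path

ALLOWED_HOSTS = {"example.com", "www.example.com"}  # domains we control
URL_PATTERN = re.compile(r'https?://([^/"\'\s>]+)[^"\'\s>]*')

def external_references(root):
    findings = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".html", ".jsp", ".xml", ".js", ".css"}:
            continue
        for match in URL_PATTERN.finditer(path.read_text(errors="ignore")):
            if match.group(1) not in ALLOWED_HOSTS:
                findings.append((str(path), match.group(0)))
    return findings

if __name__ == "__main__":
    for file_name, url in external_references("./src"):
        print(f"{file_name}: {url}")
```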
- Security. Do you know what's in that open-source system you're using? Over 60 million websites have been built on an open-source content management system called WordPress. Because it is open source, everyone can see the code - all the code - that it is built with. This could potentially allow hackers to spot weaknesses in the system. However, WordPress has pretty strict code review and auditing in place to catch that kind of thing, and any issues that are found get patched quickly, with the patches released to everyone. The question then becomes: does your website administrator actually apply those patches to your CMS?
Another security issue I ran into involved a different content management system. I discovered an undocumented 'back door' buried in the configuration that gave anyone who knew about it administrative access to the system (allowing them to log into the CMS as an administrator, with the power to delete the entire site if they knew what they were doing). Some time later, I found out that some developers who had used this CMS weren't aware of that back door and had left it open. I informed them about it, and they quickly (and nervously) slammed it shut. Get familiar with the code you are implementing!
- Code Bloat (importing a monster library for one simple bit of functionality). Sometimes developers will download a large open-source library to take advantage of a small subset of its functionality and save some time. Unfortunately, this can lead to code bloat - your application runs slowly because it's loading up a huge library to use one small piece of it.
- Support (or the lack of it). There are vast numbers of open-source projects out there, and developers need to be discerning about which ones they adopt. Some simple guidelines for choosing an open-source project:
- How many downloads (implementations) does it have? The more, the better - popularity usually means more eyes have reviewed it.
- Is there good support for it? In other words, if you run into issues or errors trying to use it, will it be easy to find a solution for your issue in forums or from the creators?
- Is it well documented? If the documentation is thorough, or if there have been published books written about the project, you're likely in good hands.
- Is it easy to implement? You don't want to waste your time trying to get your new open-source implementation up and working. Project owners have the facilities and resources to provide documentation, VM snapshots, or whatever else makes setting up your own implementation quick and easy.
- How long has it been around? Be wary of open-source projects with a short history - bleeding edge isn't always cutting edge. If you can help it, wait for a project to gain critical mass in the industry before implementing it.
Friday, March 18, 2016
- It's a great, fun read; however, it skims over the difficulties and trials of actually automating the technical systems and the software development process - where the rubber really hits the road. It practically romanticizes the idea of automation in IT - like DevOps is the goose that will lay your golden egg. Unfortunately, in my experience there's significantly more work involved in getting that golden egg.
- It also glosses over how to get the Security team on board with what DevOps wants to do. In many companies, the Security team holds the trump card: if they decide to change all your certs from 128-bit encryption to 2048-bit encryption (don't laugh - I've seen it happen, and we had to regenerate certs for all applicable servers in all environments), their wish is your command, unless you can convince someone influential that that kind of encryption is overkill in a non-prod environment.
Finding and Cracking That Golden Egg
Hurdles I've encountered en route to the DevOps 'golden egg' are:
- Siloed application projects.
- Lack of consistent naming conventions for deployment artifacts and build tags/versions. This is a major drag on the automation process.
- Lack of understanding of the dependencies between application projects (API contracts or library dependencies). If you don't understand the dependencies between your applications, they may not compile correctly, or they may not communicate properly with each other.
- Lack of consistent development methodology and culture between teams. If you have one team that doesn't get behind the new culture, and they insist on manually deploying and 'tweaking' their code/artifacts after deployment, they risk the entire release. Getting everyone on board with culture is challenging.
- Overusing a tool. When you have a hammer (a good DevOps tool), everything looks like a nail. Most of the DevOps tools out there have their niche. You could potentially use Chef, Puppet, or Ansible to do all your provisioning and automation - but is that the best solution? I'm inclined to say no. How much hacking and tweaking are you having to do to get all your automation working with that one tool? Use the tools for what they are best at - what they were originally made for. Many of these tools have open-source licenses and new functionality is being added to them all the time. While it might seem that the new functionality is turning your favourite tool into the 'one tool to rule them all', you might get a big headache getting there, trying to push a square peg into a round hole.
- Lack of version control. Everything must be stored in a repository. The stuff that isn't will bite you. VM Templates, DB baselines, your automation config - it all needs to go in there.
- DB Management. All automated DB scripts should be re-runnable and stored in a repo.
- Security. How to satisfy the Security Team? How to automate certificate generation when they are requiring you to use signed certs? How to manage all those passwords in your configuration?
- Edge Cases. Any enterprise of significant size is going to have or require environments that land outside your standard cookie-cutter automation. Causes I've seen for this are:
- Billing cycles - we needed a block of environments where we could shift the system clock forward so QA didn't have to wait 30 days to test the next set of bills and invoices.
- Stubbing vs. Real Integration - depending on the complexity of the integration testing required, there may be many variations in how integration is set up across environments - which endpoints are mocked versus which point to a 'real' service.
- New Changes/Requirements - perhaps new functionality requires new servers or services. This can make your support environments look different from your development environments.
- Licensing issues. When everything is automated, it can be easy to lose track of how many environments you have 'active' versus how many you are licensed for. License compliance can be a huge issue with automation - check out this interesting post: 'Running Java on Docker? You're Breaking the Law!'
- Downstream Dependencies. This is where the Ops side of DevOps comes into play. Any downstream dependencies your automated system has need to be monitored and understood. You can't meet your SLA with your client if your downstream dependencies can't meet that same SLA. Important systems to consider here are LDAP, DNS, the network, your ISP, and other integration points.
- YAML files. Yes, they are more terse than storing your deployment config in XML. However, I'm at a bit of a loss to see how they are better than a well-named and well-formatted CSV file. Sure, you can 'see' and manage a hierarchy with them - but you can do the same in a CSV file with the proper taxonomy. YAML files take more processing to parse and contain extraneous lines because they're trying to manage (and give a visual representation of) the hierarchy of the properties. I've seen YAML files where those extra lines with no values account for a significant percentage of the total lines in the file, making the file harder to maintain and prone to fat-fingering. Several major DevOps tools use these files, and I really can't see a good reason why, except that they were the 'new, cool thing to do.'
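To illustrate what I mean, here's a toy comparison - a sketch assuming PyYAML is installed, with made-up property names. The same hierarchy expressed as YAML and as flat dotted keys in CSV ends up as the same nested structure:

```python
# Toy comparison: the same deployment properties as YAML and as flat
# dotted-key CSV. PyYAML is assumed for the YAML half; property names
# are made up for illustration.
import csv
import io
import yaml

yaml_text = """
app:
  name: billing-api
  jvm:
    heap: 2g
  endpoints:
    payments: https://pay.example.com
"""

csv_text = """key,value
app.name,billing-api
app.jvm.heap,2g
app.endpoints.payments,https://pay.example.com
"""

def nest(rows):
    # Rebuild a nested dict from dotted keys (app.jvm.heap -> {'app': {'jvm': {'heap': ...}}}).
    tree = {}
    for row in rows:
        node = tree
        *parents, leaf = row["key"].split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = row["value"]
    return tree

from_yaml = yaml.safe_load(yaml_text)
from_csv = nest(csv.DictReader(io.StringIO(csv_text)))
print(from_yaml == from_csv)  # True -- same hierarchy either way
```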
Wednesday, March 16, 2016
[Image: DevOps-related books I have in my library currently]
Software development methodologies and practices have evolved a lot since then. It's been a challenge to keep up. I'm a pragmatic guy, and I used to think you could get pretty much all the functionality needed for an automated build and deployment stack just using Ant, CruiseControl, Kickstart, VMware, and some helpful APIs. Clearly I was wrong.
Cause for Pause
Without a doubt, the DevOps movement has captured the imagination of many a software developer. Just look at all the tools out there now: Chef, Puppet, Bamboo, RunDeck, Octopus, Docker, TeamCity, Jenkins, RubiconRed, Ansible, Vagrant, Gradle, Grails, AWS... I could go on. I'm beginning to question whether the proliferation and polarization of these tools has been helpful. Many companies looking to fill a DevOps role ask for resources with experience in the specific tool stack they are using. That must make human resource managers pull their hair out. Even personally, I'm concerned about hitching my wagon to the wrong horse. If I accept a position with a company using a Chef/Docker stack (for example), but the industry decides that Ansible/Bamboo is the holy grail, have I committed professional suicide? Probably not, but it does make one consider job opportunities carefully.
This P and P (proliferation and polarization) of DevOps tools makes me wonder what the future is going to look like for the DevOps movement. Consider what Microsoft did with C#. Instead of continuing to diverge and go their own way with the replacement for VB, they created a language with a syntax that essentially brought the software development industry back together (in a way). Brilliant move. Java developers quickly ported their favourite tools/frameworks over to C# and suddenly developing with Microsoft tools was cool again. It would sure be nice if something like that happened with the DevOps tools.
Something else to consider for the future... so far in my experience, overseas outsourced teams have not yet embraced the DevOps movement. As a result, they aren't as efficient or as competitive as they could be. When they do jump on the DevOps bandwagon and truly tap into its potential, it could be a game changer for people like me...
Tools to Watch
- Perhaps Amazon is the 'new Microsoft' with its expansive and ever expanding AWS tool stack? There's definitely some momentum and smart thinking going on there. They appear to have automated provisioning all wrapped up in a bow.
- Atlassian is another company (I didn't realize they were based out of Australia) that is putting together DevOps tools that have a lot of momentum in the industry. Their niche is collaboration tools.
- Puppet and Chef both have a strong foothold in the DevOps community. Both of them are adding new features all the time, enabling them to automate the deployment and provisioning of more products and systems. Many people use Puppet and Chef in conjunction with RunDeck or Docker to get all the automation they are looking for.