Full Stack StoryJekyll2020-07-11T15:32:21+00:00http://sahilsk.github.io/Sonu K. Meenahttp://sahilsk.github.io/sonukr666@gmail.comhttp://sahilsk.github.io/articles/adoption-of-git-backed-workflow-gitops2018-12-29T11:03:43+00:002018-12-29T11:03:43+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<p>When I first read about GitOps workflow by <a href="https://www.weave.works/blog/gitops-operations-by-pull-request">weave.works</a>, I was a bit surprised by their audacity.
I mean, moving your deploy button out of a central, policy-and-compliance-controlled dashboard locked down by Active Directory and into the wild of Git, visible to prying eyes dying to commit and see their code running in production. That’s a daunting move.</p>
<p>Especially for a business where every outage means lost revenue, why would someone take chances with failures?</p>
<p>On second thought, in this cloud era, it all makes sense. I can think of two main driving factors for the adoption of GitOps in the coming days:</p>
<ul>
<li><strong>Velocity</strong>: Gone are the days when quarterly or monthly deployments were enough, so we need to minimize manual intervention as much as we can. Security and compliance excuses should not hinder the adoption of new technology; with an open mindset you can bring the lost velocity back to your team.</li>
<li><strong>Inefficiency of traditional processes</strong> and <strong>adoption of more and more open-source tools</strong>: People are trying their best to make this world a better place. The adoption of public clouds, e.g. AWS or Azure, has given everyone common ground to solve a problem once and solve it for all. Tools like terraform and jenkins, and the vast knowledge spread across Stack Overflow, are at your disposal. These battles have already been fought; build on that work to improve the lives of others.</li>
</ul>
<p>Now, let’s talk about the prerequisites. What has made pure automation-backed deployment workflows like GitOps possible?</p>
<ul>
<li><strong>Automate every manual step</strong>: First and foremost, question the reasoning behind every process you have. Demand a constructive argument for every manual step in an otherwise automatable workflow; chances are you don’t need those manual interventions.</li>
<li><strong>Good test coverage</strong>: Adopt <a href="https://martinfowler.com/bliki/TestPyramid.html">test-pyramid</a> thinking and strike a fine balance between unit and functional tests. More functional tests mean longer runs, so move as much as you can — say 70% of your test cases — into unit tests and keep the rest for functional and end-to-end tests. This will help you balance velocity and reliability.</li>
<li><strong>Diff</strong>: Find a way to generate a diff between the already-deployed application and the one to be deployed. If you are using k8s you can leverage <a href="https://github.com/weaveworks/kubediff">kubediff</a>; if you are using terraform in your environment you can use the <a href="https://www.terraform.io/docs/commands/plan.html">terraform plan</a> command. This diff helps you decide whether the new version is good to go without destructive consequences, and it is very important to get this step right. Destructive actions include, e.g., giving an application limit requests that your cluster can’t handle, a too-permissive or too-restrictive firewall policy, or opening ports that are not supposed to be open.</li>
<li>
<p><strong>Git branch</strong>: Respect git branches and keep all branches as close to each other as possible. For promotion-based environments you need to start with at least two branches: <strong>qa</strong> and <strong>master</strong> (or develop and master).</p>
    <p>Every commit on the qa branch is deployed to the qa environment. When the QA team gives you a go-ahead for production, create a pull request from the QA branch and merge it into the master branch. This merge is the trigger for the production deployment. If anything fails, don’t roll back anything manually: commit, send a new pull request, and repeat.</p>
</li>
</ul>
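<p>As a concrete sketch of the diff step above: with terraform, the <code class="language-plaintext highlighter-rouge">plan -detailed-exitcode</code> flag encodes the outcome in the exit code (0 = no changes, 2 = changes present, anything else = an error), which a small wrapper can turn into a deploy/skip decision. The <code class="language-plaintext highlighter-rouge">gate_on_diff</code> helper below is a hypothetical illustration, not part of any tool:</p>

```shell
#!/bin/sh
# Hypothetical gate around a diff command such as `terraform plan -detailed-exitcode`.
# Exit code 0 = no changes, 2 = changes to review, anything else = error.
gate_on_diff() {
  "$@"
  case $? in
    0) echo "no-change" ;;          # nothing new: skip the deploy
    2) echo "deploy" ;;             # a reviewable diff exists: proceed
    *) echo "error"; return 1 ;;    # broken plan: stop the pipeline
  esac
}

# Example (assumes an initialised terraform working directory):
# gate_on_diff terraform plan -detailed-exitcode
```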
<p>Sounds easy, but you can’t just go ahead and start doing GitOps right away; you need a head start. So, here are some simple steps to help you get started with GitOps.</p>
<p><img src="/images/gitops-gitrepo.jpg" alt="GitOps Workflow" /></p>
<ol>
<li>Create a new repository</li>
  <li>Create three (or more, as per your environments) branches: QA, STAGING and PROD</li>
<li>Create deployment descriptors — one per service — containing the minimal application configuration required by your deployment automation. E.g.:
<ul>
<li><code class="language-plaintext highlighter-rouge">Application version</code></li>
<li><code class="language-plaintext highlighter-rouge">healthcheck_endpoints</code></li>
<li><code class="language-plaintext highlighter-rouge">healthcheck retry and timeout</code>: Some applications, e.g. Java ones, just take a bit of time to bootstrap.</li>
<li><code class="language-plaintext highlighter-rouge">domain and port</code></li>
<li><code class="language-plaintext highlighter-rouge">cpu/ram quota</code>: For docker/k8s applications, have a hard limit on resources</li>
<li><code class="language-plaintext highlighter-rouge">required ports</code></li>
<li><code class="language-plaintext highlighter-rouge">notification channels: [Emails][,Slack channels]</code></li>
<li><code class="language-plaintext highlighter-rouge">maintainer address</code></li>
</ul>
</li>
<li>Copy the same descriptors into all the branches. By looking at these descriptors, one should be able to tell which environment runs which version of a particular service; the commit log should likewise give the recent deployment history.</li>
<li>Use Jenkins to write the following generic parameterized pipelines:
<ul>
<li><strong>Healthcheck Pipeline</strong>: Given a set of parameters, e.g. the service name, it goes and runs the health check for that service.</li>
      <li><strong>Deploy Pipeline</strong>: Given a service with a version and an environment, it goes and deploys it.</li>
      <li><strong>Smoke Test Pipeline</strong>: A generic parameterized pipeline to run post-deployment checks on the specified service.</li>
      <li><strong>Regression Test Pipeline</strong>: A generic parameterized pipeline to run end-to-end tests for any service specified in the run arguments.</li>
</ul>
</li>
<li>Now, with all the generic pipelines ready, build high-level pipelines that combine these steps. These will be our Release Pipelines. Each should have a notification stage at the start and end of the pipeline to notify stakeholders, including JIRA.
<ul>
<li><strong>QA Release Pipeline</strong>: <code class="language-plaintext highlighter-rouge">Deploy Pipeline</code> + <code class="language-plaintext highlighter-rouge">Healthcheck Pipeline</code></li>
<li><strong>Stage Release Pipeline</strong>: <code class="language-plaintext highlighter-rouge">Deploy Pipeline</code> + <code class="language-plaintext highlighter-rouge">Healthcheck Pipeline</code> + <code class="language-plaintext highlighter-rouge">Regression Pipeline</code></li>
<li><strong>Prod Release Pipeline</strong>: <code class="language-plaintext highlighter-rouge">Deploy Pipeline</code> + <code class="language-plaintext highlighter-rouge">Healthcheck Pipeline</code> + <code class="language-plaintext highlighter-rouge">Smoke Test Pipeline</code></li>
</ul>
</li>
<li>At this point you should be able to deploy your service using the release pipelines above, passing parameters like service, version, and environment to deploy any application in any environment. There is only one flaw: it still requires a manual button press. Let’s trigger these release pipelines from Git.</li>
<li>Git-Monitor Pipeline: this pipeline tracks pushes to each of the branches and, based on the diff generated with the tools discussed above, triggers one of the <strong>Release Pipelines</strong>. Be careful when writing this pipeline; it should include the following checks:
<ul>
<li>The commit message must not contain a marker like <code class="language-plaintext highlighter-rouge">skip deployment</code>. If it does, you’re not supposed to deploy this commit.</li>
      <li>The committer should be on a whitelist.</li>
      <li>The commit message should reference a valid JIRA ticket, where the pipeline goes and checks for valid approvals and records deployment statuses as per compliance requirements.</li>
      <li>Lastly, take the diff of the deployed version and the requested version to decide whether to allow or reject.</li>
      <li>If all is good, call the <code class="language-plaintext highlighter-rouge">{ <Environment> Release Pipeline }</code> to deploy your changes.</li>
      <li>When deploying to higher environments, e.g. prod, take additional conditions into consideration that suit your company’s culture and policy.</li>
</ul>
</li>
</ol>
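<p>The commit checks in the Git-Monitor step can be sketched as a small gate script. Everything here — the function name, the whitelist file, and the crude JIRA-key pattern — is a hypothetical illustration of the idea, not a prescribed implementation:</p>

```shell
#!/bin/sh
# Hypothetical pre-deploy gate for a Git-Monitor pipeline. The commit message
# and committer would normally come from the push event that triggered the run.
should_deploy() {
  msg="$1"; committer="$2"
  case "$msg" in
    *"skip deployment"*) echo "skip"; return 1 ;;       # explicit opt-out marker
  esac
  case "$msg" in
    *[A-Z][A-Z]*-[0-9]*) : ;;                           # crude JIRA-key check, e.g. OPS-123
    *) echo "no-ticket"; return 1 ;;
  esac
  # whitelist.txt holds one allowed committer per line (assumed file name)
  grep -qx "$committer" whitelist.txt || { echo "not-whitelisted"; return 1; }
  echo "deploy"
}
```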
<h2 id="git-branch-policy">Git Branch Policy</h2>
<p>The idea of promotion-based deployments is to move changes from lower environments to higher ones, and each promotion gives you a sense of more and more reliable changes coming upstream. Hence, it’s important that every environment has a similar infrastructure setup (at a smaller scale), the same set of instructions, and the same deployable artifacts.</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">QA Branch</code>: Ideally, every commit should make it here with the least friction possible. Have your nightly build pipeline auto-commit the new application version to this branch whenever it is ready, to trigger an automatic deployment. <code class="language-plaintext highlighter-rouge">QA</code> environments are made to be broken, so failing fast is good; there should be zero manual intervention.</li>
  <li><code class="language-plaintext highlighter-rouge">Stage Branch</code>: This should be comparatively stable and very close to production. If it’s broken, the capacity-planning and regression-testing folks will be very mad at you. Once you have a stable QA version ready, create a PR and merge it into the stage branch after peer review.</li>
<li><code class="language-plaintext highlighter-rouge">Prod Branch</code>: Apply the same scrutiny here.</li>
</ul>
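<p>The promotion itself is just a reviewed merge. As a local sketch (in practice the merge happens via a pull request), a hypothetical <code class="language-plaintext highlighter-rouge">promote</code> helper could look like this:</p>

```shell
#!/bin/sh
# Hypothetical promotion helper: merge one environment branch into the next,
# keeping an explicit merge commit as the audit trail of the promotion.
promote() {
  from="$1"; to="$2"
  git checkout -q "$to" &&
  git merge --no-ff -q "$from" -m "Promote $from -> $to"
}

# Usage: promote qa stage   (after the PR from qa has been peer reviewed)
```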
<h2 id="extras">Extras</h2>
<ul>
<li>Integration with JIRA and Slack.</li>
<li>On every deployment failure, create a JIRA ticket and assign it to the committer. This will help you gradually improve the reliability of your deployments.</li>
<li>Pull reports out of git commits, JIRA tickets and pipelines to keep track of frequency of deployments, time to deploy, frequency of failed deployments, and use these metrics to improve your process.</li>
</ul>
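<p>Some of these numbers fall straight out of the commit log. For instance, if every production deploy is a merge commit into master (as in the workflow above), a hypothetical helper can count deploys over a window:</p>

```shell
#!/bin/sh
# Hypothetical metric: count production deploys in the last 30 days, assuming
# each deploy corresponds to one merge commit on the current (master) branch.
deploys_last_30d() {
  git log --merges --since="30 days ago" --oneline | wc -l | tr -d ' '
}
```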
<p>I’ve done something similar with our Docker-based services.</p>
<h2 id="benefits">Benefits</h2>
<ul>
<li>No baby sitting or hand holding during deployments.</li>
<li>QA, devs, and testers no longer have to maintain or worry about deployments. They know the new version is just one commit away.</li>
<li>Rollback is easy and transparent to the team.</li>
<li>Bringing sense of ownership to every commit: <strong>You break it, you fix it</strong></li>
</ul>
<p>Well, think no more: with unwavering faith in your team, plunge into GitOps. You should be able to eliminate significant toil with this fully fledged automatic deployment and rollback workflow.</p>
<p>This post is a summary of our adoption of the GitOps workflow and the learnings worth sharing with others.</p>
<p><a href="http://sahilsk.github.io/articles/adoption-of-git-backed-workflow-gitops/">Adoption of Git backed automation Workflows: GitOps</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on December 29, 2018.</p>
<h2 id="how-to-guide-on-setting-up-site-to-site-vpn-across-regions">How-to guide on setting up site-to-site vpn across regions.</h2>
<p>VPC peering allows you to peer VPCs as long as they are in the same region and have unique CIDRs. But what if your VPCs are in different regions?</p>
<p>Let’s say you want connectivity between servers running in two different regions: Singapore and Mumbai. You may want to set up database slaves in different regions for disaster recovery, or your application may need to reach clients for a business requirement. And, being DevSecOps-minded, you also want the traffic flow to be fully encrypted.</p>
<p>Whatever the reason, you want fully encrypted traffic flowing between a VPC in one region and a VPC in another. This article is about setting up one such solution using IPSec.</p>
<h2 id="ipsec-comes-into-picture">IPSec comes into picture</h2>
<p>IPSec is an Internet Engineering Task Force (IETF) standard suite of protocols that provides data authentication, integrity, and confidentiality as data is transferred between communication points across IP networks. Best part is IPSec provides data security at the IP packet level. IPSec emerged as a viable network security standard because enterprises wanted to ensure that data could be securely transmitted over the Internet. IPSec protects against possible security exposures by protecting data while in transit.</p>
<p>ipsec-tools, Openswan, strongSwan, Libreswan, etc. are a few implementations of the IPSec protocol.</p>
<h2 id="how-are-we-doing">How are we doing?</h2>
<p>Read through this <a href="http://www.slashroot.in/linux-ipsec-site-site-vpnvirtual-private-network-configuration-using-openswan">article</a> before continuing from here.
I strongly recommend it; it’ll help you understand the configuration parameters better.</p>
<ul>
<li>Launch two servers, one in each VPC, in a public subnet with a new security group</li>
<li>Install openswan on both of them</li>
<li>Configure openswan</li>
  <li>Configure both VPCs’ route tables</li>
<li>Test connectivity</li>
</ul>
<p>For illustration, let’s say we have the following VPCs in our infrastructure:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>region Private IP Public IP Subnet
-------------------------------------------------------------------------------
Mumbai 172.19.1.132 52.66.100.2 172.19.0.0/16
singapore(stage) 172.27.7.141 54.155.125.80 172.27.0.0/16
singapore(dev) 172.25.135.54 52.67.12.236 172.25.0.0/16
</code></pre></div></div>
<h2 id="installing-openswan-on-centos-7">Installing Openswan on CentOS 7</h2>
<p>Make sure you are launching public servers that are accessible via the internet. Thereafter, install openswan on them.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>yum install openswan lsof
</code></pre></div></div>
<p>Install it on the servers on both sides: in the Singapore stage VPC and in the Mumbai VPC.</p>
<h2 id="configure-openswan">Configure openswan</h2>
<h3 id="singapore-stage-vpc">Singapore Stage VPC:</h3>
<p>Create the configuration file and fill in the details. Think of ‘left’ as the source, i.e. the server you are currently logged into, and ‘right’ as the destination side.</p>
<p>Create configuration file <code class="language-plaintext highlighter-rouge">stage-sg-to-mumbai.conf</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /etc/ipsec.d
cat > stage-sg-to-mumbai.conf
conn stage-sg-to-mumbai
type=tunnel
authby=secret
left=%defaultroute
leftid=54.155.125.80
leftnexthop=%defaultroute
leftsubnet=172.27.0.0/16
right=52.66.100.2
rightsubnet=172.19.0.0/16
pfs=yes
auto=start
</code></pre></div></div>
<p>For authentication we’ll be using a pre-shared key (PSK).</p>
<p>Create a secret file with a format similar to the one shown below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> leftPublicIp RightPublicIp: PSK <KEY GOES HERE>
</code></pre></div></div>
<p>Create <code class="language-plaintext highlighter-rouge">stage-sg-to-mumbai-secret.secrets</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> cat > stage-sg-to-mumbai-secret.secrets
54.155.125.80 52.66.100.2: PSK "mySuperSecretGoesHere"
</code></pre></div></div>
<p>Restart ipsec and check its status:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service ipsec restart; tail -F /var/log/messages
service ipsec status
</code></pre></div></div>
<p>To verify, use the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ipsec verify
</code></pre></div></div>
<p>Next, disable the source/destination check on the instance. Go to <code class="language-plaintext highlighter-rouge">Action -> Networking -> Change Source/Dest. Check -> Disable</code>
<img src="/images/src-dest-disable.png" alt="source-destination check" /> and disable it.
<img src="/images/disablecheck.png" alt="disable prompt" /></p>
<p>The security group needs to be tweaked too:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Type Protocol PortRange Source Why
All traffic All All 172.19.0.0/16 ie. receive from mumbai vpc
All traffic All All 172.25.0.0/16 ie. receive from singapore dev vpc
All traffic All All 52.66.100.2/32 ie. receive from mumbai openswan server
SSH TCP 22 <myJumpHostIp>/32 ie. ssh access via jump host only
ALL ICMP ALL N/A 0.0.0.0/0 ie. for mysql master-slave replication to work
</code></pre></div></div>
<h3 id="mumbai--vpc">Mumbai VPC</h3>
<p>Create configuration file <code class="language-plaintext highlighter-rouge">stg-mumbai-to-sg.conf</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /etc/ipsec.d
cat > stg-mumbai-to-sg.conf
conn stg-mumbai-to-sg
type=tunnel
authby=secret
left=%defaultroute
leftid=52.66.100.2
leftnexthop=%defaultroute
leftsubnet=172.19.0.0/16
right=54.155.125.80
rightsubnet=172.27.0.0/16
pfs=yes
auto=start
</code></pre></div></div>
<p>Again, create the secret. Make sure the openswan servers on both sides have the same secret key; otherwise authentication will fail.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat > stg-mumbai-to-sg.secrets
52.66.100.2 54.155.125.80: PSK "mySuperSecretGoesHere"
</code></pre></div></div>
<p>Restart ipsec and check its status:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service ipsec restart; tail -F /var/log/messages
service ipsec status
</code></pre></div></div>
<p>To verify, use the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ipsec verify
</code></pre></div></div>
<p>Next, disable the source/destination check, the same way we did in the Singapore region. Consult the screenshots above.</p>
<p>The security group needs to be tweaked too:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Type Protocol PortRange Source Why
All traffic All All 172.19.0.0/16 ie. receive from mumbai vpc
All traffic All All 172.25.0.0/16 ie. receive from singapore dev vpc
All traffic All All 54.155.125.80/32 ie. receive from singapore openswan server
SSH TCP 22 <myJumpHostIp>/32 ie. ssh access via jump host only
ALL ICMP ALL N/A 0.0.0.0/0 ie. for mysql master-slave replication to work
</code></pre></div></div>
<h4 id="oh-wait-we-have-two-vpc-running-in-singapore-region-stage-and-dev-how-do-we-connect-dev-vpc-also-with-mumbai-vpc-">oh wait.. we have two vpc running in singapore region: Stage and Dev. How do we connect Dev VPC also with mumbai VPC ?</h4>
<p>We’ll need to launch one more openswan server in the dev VPC and configure the Mumbai side to receive traffic for it as well.</p>
<h4 id="cant-we-use-the-same-openswan-public-server-that-weve-launched-in-stage-vpc">Can’t we use the same openswan public server that we’ve launched in stage VPC?</h4>
<p>No, because the route tables are in different VPCs. A route table won’t let you select an instance running in a different VPC as a routing target, so you cannot add your routing rules there. Hence, a second server needs to be launched in the dev VPC too.</p>
<p>On the Mumbai side, since we have only one VPC, we can use the same openswan server with one more configuration file holding the dev VPC network information. We’ll also add one more entry to the same route table. Example entries are shown below.</p>
<h3 id="to-have-singapore-dev-vpc-connectivity-with-mumbai-vpc">To have singapore dev vpc connectivity with mumbai VPC</h3>
<ul>
<li>Launch an openswan server in the public subnet of the dev VPC using the steps given above. Make sure you have a new security group here as well.</li>
  <li>Add the following configuration changes</li>
</ul>
<h4 id="singapore-dev-vpc">Singapore Dev VPC</h4>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /etc/ipsec.d/
cat > dev-sg-to-mumbai.conf
conn dev-sg-to-mumbai
type=tunnel
authby=secret
left=%defaultroute
leftid=52.67.12.236
leftnexthop=%defaultroute
leftsubnet=172.25.0.0/16
right=52.66.100.2
rightsubnet=172.19.0.0/16
pfs=yes
auto=start
</code></pre></div></div>
<p>Don’t forget to add the secret file as well:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat > dev-sg-to-mumbai.secrets
52.67.12.236 52.66.100.2: PSK "mySuperSecretGoesHere"
</code></pre></div></div>
<p>Restart ipsec and check its status:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service ipsec restart; tail -F /var/log/messages
service ipsec status
</code></pre></div></div>
<p>To verify, use the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ipsec verify
</code></pre></div></div>
<p>Next, disable the source/destination check, the same way we did in the Singapore region. Consult the screenshots pasted earlier.</p>
<h4 id="mumbai-vpc">Mumbai VPC</h4>
<p>Since we’re in the same VPC, we can use the same openswan server and just add a new configuration file there:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /etc/ipsec.d
cat > dev-mumbai-to-sg.conf
conn dev-mumbai-to-sg
type=tunnel
authby=secret
left=%defaultroute
leftid=52.66.100.2
leftnexthop=%defaultroute
leftsubnet=172.19.0.0/16
right=52.67.12.236
rightsubnet=172.25.0.0/16
pfs=yes
auto=start
</code></pre></div></div>
<p>Add the secret as well:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat > dev-mumbai-to-sg.secrets
52.66.100.2 52.67.12.236: PSK "mySuperSecretGoesHere"
</code></pre></div></div>
<p>Restart ipsec and check its status:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service ipsec restart; tail -F /var/log/messages
service ipsec status
</code></pre></div></div>
<p>To verify, use the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ipsec verify
</code></pre></div></div>
<h2 id="modify-route-table">Modify route table</h2>
<p>We need to modify the VPC route tables so that we can route traffic from the Singapore VPCs to the Mumbai VPC and vice versa.
A route table simply says that a packet with a given <code class="language-plaintext highlighter-rouge">Destination</code> should go via this <code class="language-plaintext highlighter-rouge">Target</code>, and the target forwards the packet to its rightful owner.</p>
<p>Example:</p>
<h3 id="mumbai-vpc-route-table">Mumbai VPC Route table</h3>
<p>Here we’re modifying the Mumbai VPC route table. We’re telling it that packets with destination 172.25.0.0/16 (the Singapore dev VPC subnet) or 172.27.0.0/16 (the Singapore stage VPC subnet) should go via the Mumbai openswan server, identified by instance ID as shown in the screenshot. Make similar changes in every route table whose subnet you want connectivity with.</p>
<p><img src="/images/routetableexample.png" alt="route table example config" /></p>
<p>The same changes are required in the Singapore-side stage VPC route tables.</p>
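<p>The console steps in the screenshot have a CLI equivalent. The route-table and instance IDs below are placeholders; substitute your Mumbai route table ID and the openswan instance ID:</p>

```shell
# Hypothetical CLI equivalent of the console route entries shown above.
# rtb-... and i-... are placeholder IDs for your route table and openswan server.
aws ec2 create-route --route-table-id rtb-0a1b2c3d \
    --destination-cidr-block 172.25.0.0/16 --instance-id i-0a1b2c3d4e5f
aws ec2 create-route --route-table-id rtb-0a1b2c3d \
    --destination-cidr-block 172.27.0.0/16 --instance-id i-0a1b2c3d4e5f
```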
<h2 id="test-connectivity">Test Connectivity</h2>
<p>Let’s say <code class="language-plaintext highlighter-rouge">Server-A-In-Sg</code>, listening on port <code class="language-plaintext highlighter-rouge">XYZ</code>, wants to reach <code class="language-plaintext highlighter-rouge">Server-B-In-Mumbai</code>, listening on port <code class="language-plaintext highlighter-rouge">PQR</code>. The Server-A-In-Sg security group will allow port XYZ from Server-B-In-Mumbai’s private IP, say <code class="language-plaintext highlighter-rouge">172.19.x.y</code>.</p>
<p>Similarly, the Server-B-In-Mumbai security group should allow traffic on port PQR from Server-A-In-Sg’s private IP, say <code class="language-plaintext highlighter-rouge">172.25.p.q</code>.
That’s all that’s required. You don’t have to add the openswan server IPs here again in the security groups, because the route tables already direct the traffic through them.</p>
<blockquote>
<p>PS: For mysql slave replication to work, enable ICMP traffic as well.</p>
</blockquote>
<p>To test, you may use the telnet utility:</p>
<p>From the Server-A-In-Sg server, try reaching the Mumbai-side server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>telnet 172.19.x.y XYZ
</code></pre></div></div>
<p>From the Server-B-In-Mumbai server, try reaching the Singapore-side server:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>telnet 172.25.p.q PQR
</code></pre></div></div>
<p>If you see a connected message, you are all set.</p>
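<p>If telnet isn’t available on the instance, netcat’s zero-I/O mode is an equivalent check (the IP and port are the same placeholders as above):</p>

```shell
# Hypothetical alternative when telnet isn't installed: nc -z opens and closes
# the connection without sending data, -v reports success or failure.
nc -zv 172.19.x.y XYZ
```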
<h2 id="conclusion">Conclusion</h2>
<p>Openswan configuration is easy to understand and write. On AWS, the firewall-level settings are taken care of by security groups and route tables, making the process snappier.
One may have concerns about a production setup, such as how to make it highly available. There, Linux-HA can be used with a floating-IP technique; but more on that, perhaps, in the next article.</p>
<h2 id="related-articles">Related articles</h2>
<ul>
<li><a href="https://aws.amazon.com/articles/5472675506466066">Connecting Multiple VPCs with EC2 Instances(IPSec)</a></li>
</ul>
<p><a href="http://sahilsk.github.io/articles/site-to-site-vpn-setup-on-aws/">Site-to-site Vpn Setup on Aws</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on January 06, 2017.</p>
<p>We know FaaS, functions as a service, are now the big thing in the market.
AWS Lambda and webtask.io are a couple of good implementations that help you write serverless code.</p>
<p>You can read more about serverless architecture on <a href="http://www.martinfowler.com/articles/serverless.html">martin fowler
blog</a>.</p>
<p>Here I want to pen down one use case that fits FaaS perfectly: auditing.</p>
<p>With a growing team, it becomes difficult to audit your infrastructure as more and more resources are created. Are tagging standards being followed? Do security groups follow the security guidelines? Are the right SSH key pair, image ID, and what-not used when creating new resources?</p>
<p>So, wouldn’t it be great if we could write one service that polls AWS resources periodically and gets us this audit report? Yes. But …</p>
<p>You would then have to create infrastructure for this service or cron to run on: launch a server, create an image, put monitoring in place, and what-not. It’s a pain.</p>
<p>But there is a friendly, quick, and very cheap way of doing it. Enter FaaS.</p>
<p>Create your function and you are done. Seriously, that’s all it requires.</p>
<p>Webtask.io is one such FaaS. AWS Lambda is also out there, but I found webtask.io deployment easier and snappier.</p>
<h2 id="difference-between-aws-lambda-vs-webtaskio">Difference between AWS Lambda vs Webtask.io</h2>
<ul>
<li>webtask.io currently supports only Node.js, and it supports it pretty well. AWS Lambda, on the other hand, supports multiple programming languages.</li>
<li>webtask.io deployment is easy, but how it handles a new deployment is kind of scary: it all happens behind your back, and there is no canary-testing support. Say a deployment fails and you want to roll back to an old version: in webtask.io you have to deploy again, whereas in AWS Lambda you just switch the pointer to the last working code, as in Capistrano.</li>
<li>Code written for webtask.io can run as-is on AWS Lambda with little or no change at all. webtask.io supports an AWS Lambda-compatible <a href="https://webtask.io/docs/model">programming model</a> as well.</li>
<li>webtask.io has a rich editor that not only highlights code but also does syntax checking. You can catch 70% of your errors right there.</li>
<li>The AWS Lambda docs are very rich; you can find every piece of information there.</li>
<li>Realtime debugging is very easy in webtask.io with their streaming logs. That means less context switching.</li>
<li>Pricing?</li>
</ul>
<p>While working with webtask.io, I found some areas that could be improved further. Here are some of them:</p>
<ul>
<li>The text editor has syntax-check support, but it would be great if the webtask client itself could run a syntax check before deploying and throw errors on the console. It’s painful to see your baby crying because of a typo.</li>
<li>Webtask.io’s secrets and data-storage strategy is very simple to use but difficult to figure out at the start. A little more documentation would help here.</li>
</ul>
<p>In this short tutorial, I’ll be using webtask.io to write an auditing service.</p>
<h2 id="what-are-we-doing">What are we doing?</h2>
<p>We’ll be creating an auditing service that checks whether a launched instance is tagged as per the audit rules.
Resource tagging has many benefits, but an important one is tracking <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html#tag-resources-for-billing">AWS resource billing</a></p>
<h2 id="how-are-we-doing">How are we doing?</h2>
<p>We’ll write a Node.js program that listens to CloudWatch events. CloudWatch is an AWS service that emits AWS events you can subscribe to; one such event is the launch of an EC2 instance, which is what we’ll be tracking. To subscribe our audit service to CloudWatch events, we must create an AWS SNS topic where CloudWatch will send only the events matching our rules.</p>
<p><img src="/images/cloudwatch_event_rule.png" alt="Alt cloudwatch" /></p>
<p>When an instance is launched, an event is emitted by CloudWatch and goes to the SNS topic. The SNS topic (<code class="language-plaintext highlighter-rouge">ec2_launches_check</code>) then hits our audit service endpoint with a payload containing the instance ID. Using this instance ID, our auditing service runs its audit checks.</p>
<p>Currently, we check whether the instance has the “name” and “service” tags present. If it violates this rule, our auditing service dispatches an alert.
There are different places where this alert logic can live:</p>
<ul>
<li>Write in the code to send mail/pagerduty alert using 3rd party services like mailgun, etc.</li>
</ul>
<p>There is a problem with that: if tomorrow you have to update the alert subscriber list, you then have to modify the code or update the environment.</p>
<p>Wouldn’t it be better if we could publish the alert to one middleware where subscribers
can be added and removed conveniently through a nice GUI? An AWS SNS topic is one
such place. So our audit service publishes alerts to another AWS SNS topic (<code class="language-plaintext highlighter-rouge">ec2_audit_alert</code>), and
subscribers can subscribe to it via SMS, email, or another 3rd-party
service.</p>
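<p>The tag check itself boils down to a small pure function. Here is a sketch in node.js; the required-tag list and the <code class="language-plaintext highlighter-rouge">{ Key, Value }</code> payload shape (modeled on how EC2 represents tags) are illustrative, not the exact service code:</p>

```javascript
// Audit rule: tag keys every instance must carry (illustrative list).
const REQUIRED_TAGS = ['name', 'service'];

// `tags` mirrors EC2's tag format: an array of { Key, Value } objects.
function auditTags(tags) {
  const present = new Set(tags.map(t => t.Key.toLowerCase()));
  const missing = REQUIRED_TAGS.filter(key => !present.has(key));
  return { compliant: missing.length === 0, missing };
}

// An instance tagged only with "Name" fails the audit.
console.log(auditTags([{ Key: 'Name', Value: 'web-01' }]));
// { compliant: false, missing: [ 'service' ] }
```

<p>On a violation, the real service would publish the <code class="language-plaintext highlighter-rouge">missing</code> list to the alert topic rather than just logging it.</p>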
<p><img src="/images/faas_ec2_auditing.png" alt="Alt architecture" /></p>
<h2 id="imlementation-show-me-the-code">Implementation: show me the code</h2>
<ul>
<li>Create two AWS SNS topics: <code class="language-plaintext highlighter-rouge">ec2_launches_check</code> and <code class="language-plaintext highlighter-rouge">ec2_audit_alert</code></li>
<li>Write the audit service in node.js</li>
<li>Create a webtask.io task and deploy it with AWS credentials and the alert topic
ARN (<code class="language-plaintext highlighter-rouge">ec2_audit_alert</code>). The code is smart enough to subscribe to this SNS topic automatically.</li>
<li>Create an AWS CloudWatch event rule and point it to the <code class="language-plaintext highlighter-rouge">ec2_launches_check</code> SNS topic.</li>
</ul>
<h2 id="here-is-the-result">Here is the result</h2>
<p><img src="/images/audit_result.png" alt="Alt audit-result" /></p>
<p>That’s it. You can find the <a href="https://github.com/sahilsk/webtask.io-examples">project repo
here</a></p>
<p><a href="http://sahilsk.github.io/articles/faas-for-ec2-auditing/">FaaS for Ec2 Auditing</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on September 17, 2016.</p>http://sahilsk.github.io/articles/how-to-write-a-cron2016-04-04T12:02:35+00:002016-04-04T12:02:35+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h2 id="how-to-write-a-good-cron">How to write a good cron?</h2>
<p>A good cron should adhere to the Unix philosophy of the single-responsibility principle.
To elaborate: your cron should do only one thing and do it right.
Keeping cron code simple and modular will not only help you debug issues easily,
but also help you carry learnings from one cron to another.</p>
<p>Here are a few learnings you can use while writing crons.</p>
<h3 id="q-what-cron-should-do">Q. What cron should do?</h3>
<ul>
<li>Cron scripts should be lightweight and work as helpers for workers.</li>
<li>A cron should create jobs, push them to a queue, and finish. These jobs are then processed by workers.</li>
<li>A cron should fail fast and finish fast</li>
</ul>
<h3 id="q-what-cron-should-not-do">Q. What cron should not do?</h3>
<ul>
<li>A cron should never do any processing task.</li>
<li>A cron should not do any long-running operation whose failure can create inconsistencies</li>
</ul>
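<p>Put concretely, a cron that follows these rules does nothing but enumerate work and enqueue it. In this sketch the queue is a plain in-memory array standing in for a real broker client (SQS, RabbitMQ, etc.), and the job shape is made up for illustration:</p>

```javascript
// Minimal cron body: enumerate pending work, push one job per item, exit.
// All processing is left to the workers consuming the queue.
function runCron(pendingItems, queue) {
  for (const item of pendingItems) {
    queue.push({ type: 'process-item', id: item.id, enqueuedAt: Date.now() });
  }
  return queue.length; // the cron finishes here; nothing is processed
}

const queue = [];
runCron([{ id: 1 }, { id: 2 }], queue);
console.log(queue.length); // 2
```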
<h2 id="other-important-considerations">Other important considerations:</h2>
<h3 id="logging">Logging</h3>
<ul>
<li>Logs are your single source of truth. If they’re lost, you are left scratching your head finding answers to simple bugs</li>
<li>What to log and what not to log, however, can be tricky. To keep it simple,
we advise you never to log any client/customer data. If credentials or
secrets are required, they should also not be part of your log output.</li>
<li>Here are few simple commandments to follow:
<ul>
<li>Do not log client/customer data</li>
<li>Secrets and credentials should not be part of logs</li>
<li>No secrets should be provided as run arguments; use a wrapper run script instead, if required</li>
<li>Make use of environment variables as much as possible.</li>
</ul>
</li>
<li>Ship logs to graylog</li>
</ul>
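<p>The commandments above can be enforced mechanically. One way is a small log wrapper that drops secret-looking fields before anything is written; the key list below is illustrative and should be extended to match your own config names:</p>

```javascript
// Redact secret-looking fields before logging (illustrative key list).
const SECRET_KEYS = ['password', 'secret', 'token', 'api_key'];

function redact(fields) {
  const clean = {};
  for (const [key, value] of Object.entries(fields)) {
    // Replace any value whose key looks like a secret.
    clean[key] = SECRET_KEYS.some(s => key.toLowerCase().includes(s))
      ? '[REDACTED]'
      : value;
  }
  return clean;
}

console.log(redact({ user: 'bob', api_key: 'abc123' }));
// { user: 'bob', api_key: '[REDACTED]' }
```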
<h3 id="alerting">Alerting</h3>
<ul>
<li>If your cron raises an error, it should raise an alert right away.</li>
<li>Handle every exception, categorise them, and push them to sentry. If immediate attention is required, don’t be afraid to raise a pagerduty alert</li>
<li>Post errors/warnings to sentry</li>
</ul>
<h2 id="faq">FAQ</h2>
<h4 id="q-how-to-know-if-my-cron-run-successfully-or-not">Q. How to know if my cron ran successfully or not?</h4>
<p>A. See the Logging section above.</p>
<h4 id="q-how-to-get-alert-on-failed-cron-run">Q. How to get alert on failed cron run?</h4>
<p>A. A cron can fail for three reasons:</p>
<ul>
<li>
<p>The system kills it, e.g. the OOM killer.
Why was the cron doing a resource-intensive operation in the first place? Make it lightweight</p>
</li>
<li>
<p>An exception is raised in the code.
Poor exception handling can abort a cron, so improve code quality and log exceptions to sentry or graylog.
On an exception, raise a pagerduty alert right away</p>
</li>
<li>
<p>Somebody killed/shifted/removed the cron from the server.
See the next FAQ.</p>
</li>
</ul>
<h4 id="q-what-if-somebody-removed-shiftremoved-cron-from-server">Q. What if somebody shifted/removed the cron from the server?</h4>
<p>A. Improve team collaboration and keep everybody in sync.
Avoid manual changes on the server; make crons part of code deployment.
If automation is not in place, contact DevOps to bring it in ASAP</p>
<h4 id="q-how-to-check-if-cron-run-on-particular-day-or-not">Q. How to check if a cron ran on a particular day or not?</h4>
<p>A. Check in graylog. If it’s not there, check in sentry. If it is nowhere,
god bless you</p>
<h4 id="q-shouldnt-devops-monitor-cron-for-us">Q. Shouldn’t devops monitor cron for us?</h4>
<p>A. Go bald and die</p>
<p><a href="http://sahilsk.github.io/articles/how-to-write-a-cron/">How to Write a Cron?</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on April 04, 2016.</p>http://sahilsk.github.io/articles/how-to-write-a-good-worker2016-04-04T08:57:53+00:002016-04-04T08:57:53+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h2 id="how-to-write-a-good-worker">How to write a good worker?</h2>
<p>A good worker should adhere to the Unix philosophy of the single-responsibility principle.
To elaborate: your worker should do one thing and do it right.
Keeping worker code simple and modular will not only help you debug issues
easily, but also help you carry learnings from one worker to another.</p>
<p>Here are a few of these learnings you can use while writing workers. In the
end you’ll have a robust worker that works as intended.</p>
<h3 id="logging">Logging</h3>
<ul>
<li>
<p>Logs are your single source of truth. If they’re lost, you are left scratching your head finding answers to simple bugs</p>
</li>
<li>
<p>What to log and what not to log, however, can be tricky. To keep it simple, we
advise you never to log any client/customer data. If credentials or secrets are
required, they should also not be part of your log output.</p>
</li>
<li>
<p>Here are few simple commandments to follow:</p>
<ul>
<li>Do not log client/customer data</li>
<li>Secrets and credentials should not be part of logs</li>
<li>No secrets should be provided as run arguments; use a wrapper run script instead,
if required</li>
<li>Make use of environment variables as much as possible. They are more
reliable than human-provided configs</li>
<li>Send logs to graylog, to avoid SSHing into servers</li>
</ul>
</li>
</ul>
<h3 id="retry-reque-die">Retry->Requeue->Die</h3>
<ul>
<li>If a worker runs on a queue, then for every message it fails to process, it
must push the message back to the queue to retry again.</li>
<li>It should not retry forever. After, say, 3 retries it should raise an alert</li>
<li>Use the AWS SQS dead-letter queue and visibility timeout features for this</li>
</ul>
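<p>The retry-then-die flow amounts to a counter carried with each message, much like SQS’s receive count. A sketch, with hypothetical <code class="language-plaintext highlighter-rouge">handle</code> and <code class="language-plaintext highlighter-rouge">alert</code> hooks supplied by the caller:</p>

```javascript
// Retry -> Requeue -> Die: requeue a failed message up to MAX_RETRIES
// attempts, then stop requeuing and raise an alert instead.
const MAX_RETRIES = 3;

function processMessage(msg, queue, handle, alert) {
  try {
    handle(msg);
  } catch (err) {
    const attempts = (msg.attempts || 0) + 1;
    if (attempts < MAX_RETRIES) {
      queue.push({ ...msg, attempts }); // requeue for another try
    } else {
      alert(`message ${msg.id} failed ${attempts} times: ${err.message}`);
    }
  }
}

// A handler that always fails exercises the full retry path.
const queue = [];
const alerts = [];
const failing = () => { throw new Error('boom'); };
processMessage({ id: 42 }, queue, failing, a => alerts.push(a));
while (queue.length) processMessage(queue.shift(), queue, failing, a => alerts.push(a));
console.log(alerts.length); // 1
```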
<h3 id="alert">Alert</h3>
<ul>
<li>If your worker finds any critical issue, it should raise an alert or inform
others rather than depending on others to raise the alert for you</li>
</ul>
<h3 id="third-party-service">Third Party Service</h3>
<ul>
<li>
<p>If a worker relies on any third-party service, it should use the circuit-breaker pattern</p>
<ul>
<li>Put a timeout on the service’s response time</li>
<li>If the service doesn’t respond in time, retry and then fail fast</li>
<li>Report or alert if the 3rd-party service is not reliable enough.</li>
<li><a href="http://doc.akka.io/docs/akka/snapshot/common/circuitbreaker.html">Akka circuit-breaker documentation</a></li>
</ul>
<p><img src="http://sahilsk.github.io/images/circuit-breaker.png" alt="Circuit-Breaker" /></p>
</li>
</ul>
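<p>In its simplest form a circuit breaker is a failure counter with a cooldown. The sketch below is not the Akka implementation linked above, just the same idea in node.js: after enough consecutive failures the circuit opens and calls fail fast until the cooldown elapses.</p>

```javascript
// Minimal circuit breaker: after `threshold` consecutive failures the
// circuit opens and calls fail fast until `cooldownMs` has elapsed.
class CircuitBreaker {
  constructor(threshold = 3, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  call(fn) {
    if (this.openedAt !== null) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error('circuit open: failing fast'); // skip the service
      }
      this.openedAt = null; // half-open: allow one trial call
      this.failures = 0;
    }
    try {
      const result = fn();
      this.failures = 0; // success resets the counter
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }
}
```

<p>Pair this with a response timeout on the wrapped call; the breaker only counts failures, it does not bound how long a single call takes.</p>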
<p><a href="http://sahilsk.github.io/articles/how-to-write-a-good-worker/">How to Write a Good Worker?</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on April 04, 2016.</p>http://sahilsk.github.io/articles/how-to-create-a-good-ami2015-07-21T10:55:58+00:002015-07-21T10:55:58+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h2 id="what-is-amazon-machine-imageami">What is Amazon Machine Image(AMI)?</h2>
<p>Quoting from wikipedia:</p>
<blockquote>
<p>An Amazon Machine Image (AMI) is a special type of virtual appliance that is used to instantiate (create) a virtual machine within the Amazon Elastic Compute Cloud (“EC2”). It serves as the basic unit of deployment for services delivered using EC2. ~wikipedia</p>
</blockquote>
<p>Thank you wikipedia (:</p>
<h2 id="benefits-of-using-ami">Benefits of using AMI</h2>
<ul>
<li>
<p>fast server provisioning and spinning</p>
<ul>
<li>
<p>With the base system ready, you don’t need to perform the same provisioning steps every time you spin up a new instance. Server provisioning may include ntp setup, user provisioning (e.g. granting team access to the server), application provisioning (e.g. installing app/web servers), embedding organization/3rd-party credentials in environment variables, and more.</p>
</li>
<li>
<p>Since all steps are pre-baked into the AMI, all that’s left for the server is to start the service manager and start serving requests. An AMI helps you reduce new-server provisioning time from 20-30 minutes to a mere couple of minutes.</p>
</li>
</ul>
</li>
<li>
<p>Be ready for traffic spikes in time</p>
<ul>
<li>If your infrastructure doesn’t scale in time with a surge in traffic, there is no point in having auto-scaling policies. Your servers should spin up fast so they can start receiving requests quickly. An AMI helps minimize this time and thus answers your growing traffic in time.</li>
</ul>
</li>
<li>
<p>Known state of server</p>
<ul>
<li>Since all servers are spun from the same AMI, it’s guaranteed that all are running the same packages at the same versions, thereby eliminating surprises. Nobody likes <a href="http://martinfowler.com/bliki/SnowflakeServer.html">snowflake servers</a> and we should always strive to avoid configuration drift. This is one step towards <a href="http://martinfowler.com/bliki/ImmutableServer.html">immutable servers</a></li>
</ul>
</li>
</ul>
<p>Considering the benefits of AMIs, it’s essential to create a good one. Here are some steps you can consider the next time you create an AMI.</p>
<h2 id="how-to-create-a-good-ami">How to create a good AMI?</h2>
<h2 id="check-list">Check List:</h2>
<ol>
<li>Let bootup complete properly. Check <code class="language-plaintext highlighter-rouge">/var/log/boot.log</code>; maybe even let the instance run for 5-7 minutes before proceeding.</li>
<li>Update all system packages and reboot. Ensure all is good here.</li>
<li>Stop all running application services</li>
<li>Check the mail queue (mailq/sendmail)</li>
<li>If needed, flush the queue</li>
<li>Once the queue is empty, you’re ready to actually make the image</li>
<li>Delete shell history</li>
<li>Clean authorized keys</li>
<li>Double-check that code deploys from the right source; boot it up again just in case and verify it’s using the right source (if chef/puppet is in use)</li>
<li>Issue a stop to this instance</li>
<li>Proceed with AMI creation</li>
</ol>
<hr />
<h2 id="delete-shell-history">Delete Shell history</h2>
<p>Always delete the shell history before creating your AMI.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>find /root/.*history /home/*/.*history -exec rm -f {} \;
</code></pre></div></div>
<h2 id="clean--authorized-keys">Clean authorized keys</h2>
<p>If you’re creating Public AMI you may want to perform following steps:</p>
<ul>
<li>
<p>Exclude SSH authorized keys before creating your AMI.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> find / -name "authorized_keys" -exec rm -f {} \;
</code></pre></div> </div>
</li>
<li>
<p>Ensure that your private credentials for third-party applications and remote services are deleted</p>
<p>To locate “authorized_keys” files on disk, run as root:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> find / -name "authorized_keys" -print -exec cat {} \;
</code></pre></div> </div>
</li>
</ul>
<p>WARNING: Please execute the above commands only if you know what you’re doing.</p>
<hr />
<p>Your checklist may differ depending on your use case. With this article, however, I wanted to share some best practices I developed while working on AMIs.</p>
<p>Thanks for dropping by ;)</p>
<h2 id="references">References:</h2>
<ul>
<li><a href="https://aws.amazon.com/articles/0155828273219400">How To Share and Use Public AMIs in A Secure Manner</a></li>
</ul>
<p><a href="http://sahilsk.github.io/articles/how-to-create-a-good-ami/">How to create a Good AMI?</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on July 21, 2015.</p>http://sahilsk.github.io/articles/so-youre-writing-api-client2015-06-08T19:25:29+00:002015-06-08T19:25:29+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<p>The advent of the internet of things, and especially the rise of mobile devices, has pushed modern web applications to be served as SaaS: Software as a Service.</p>
<p>Now browsers alone don’t account for the traffic surge; mobile applications do as well. A few companies take this to a new level by providing their service as a platform for other developers to build their own applications on top. Thereby, a rise in traffic is inevitable.</p>
<p>With the start of the <code class="language-plaintext highlighter-rouge">internet of things</code>, an API layer has become the norm. This post is about writing clients that consume APIs, i.e. API clients.</p>
<p>If you’re writing an API client, the first thing you need to decide is whether to show the HTTP response code to users or not.
If not, then how will you inform users about errors? Using custom error codes is also a good option.</p>
<p>However, there is a better way, the best of both worlds: include both. Align your response codes with <a href="http://en.wikipedia.org/wiki/Http_error_codes">HTTP status codes</a></p>
<p>I personally feel attaching the HTTP status code to every response is a nice way of enriching an API client response.</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "data": "(optional) My data goes here",
  "error": "(optional) My error, if any, goes here"
}
</code></pre></div></div>
<p>Looking at the above response, the consumer learns only a partial truth: it doesn’t say anything about the API server response. This response block might work well with a small number of API endpoints, whose errors become familiar after a very short usage, but it doesn’t scale well beyond that.</p>
<p>Let’s enrich our response block with a few more fields.</p>
<h3 id="starting-with-errors-first">Starting with errors first</h3>
<p>Let’s start with an example :</p>
<p><a href="https://dev.twitter.com/overview/api/response-codes">Twitter api error response</a> look like this:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"errors"</span><span class="p">:[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"message"</span><span class="p">:</span><span class="s2">"Sorry, that page does not exist"</span><span class="p">,</span><span class="w">
</span><span class="nl">"code"</span><span class="p">:</span><span class="mi">34</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>Excerpt from Twitter Api documentation:</p>
<blockquote>
<blockquote>
<p>If you see an error response which is not listed in the <a href="https://dev.twitter.com/overview/api/response-codes">twitter error code table</a>, then fall back to the HTTP status code in order to determine the best way to address the error.</p>
</blockquote>
</blockquote>
<p>This says we need the HTTP response code as well, in case the error code is missing in the API response. So, the API client response should include the <strong>http status code</strong>.</p>
<p>A simple, informative API client response could be:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"status"</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="mi">401</span><span class="p">,</span><span class="w">
</span><span class="nl">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Authentication"</span><span class="p">,</span><span class="w">
</span><span class="nl">"code"</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="mi">2334</span><span class="p">,</span><span class="w">
</span><span class="nl">"more_info"</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="s2">"http://myApp.com/docs/errors/2334"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>This response block shows where to go next to get more information on the error.</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">status</code>: is aligned with http status code.</li>
<li><code class="language-plaintext highlighter-rouge">Code</code>: is used to add more information to this specific error.</li>
<li><code class="language-plaintext highlighter-rouge">more_info</code>:
In API design we always strive to be <strong>verbose</strong>; there is no harm in it. You can also link to further documentation using the <code class="language-plaintext highlighter-rouge">more_info</code> field in your response block.</li>
</ul>
<p>More on <code class="language-plaintext highlighter-rouge">status</code>:</p>
<ul>
<li>
<p>Don’t try to handle every HTTP status code.</p>
<p>You can start with 3 codes first: <strong>200</strong> (OK), <strong>404</strong> (Not Found) &amp; <strong>500</strong> (Internal Server Error).
Later on you can add more, like the following, if the need arises.</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">201</code> - Created</li>
<li><code class="language-plaintext highlighter-rouge">304</code> - Not Modified</li>
<li><code class="language-plaintext highlighter-rouge">400</code> - Bad Request</li>
<li><code class="language-plaintext highlighter-rouge">401</code> - Unauthorized</li>
<li><code class="language-plaintext highlighter-rouge">403</code> - Forbidden</li>
</ul>
</li>
</ul>
<h3 id="enriching-response">Enriching response</h3>
<p>Having dealt with errors, we can now proceed with returning good responses.</p>
<p>I personally like attaching the following fields to an API response:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">Status</code> - HTTP Status code</li>
<li><code class="language-plaintext highlighter-rouge">Data</code> (optional) - Any data returned</li>
<li><code class="language-plaintext highlighter-rouge">Error Code</code> (optional) - HTTP code associated with the error</li>
<li><code class="language-plaintext highlighter-rouge">Error Message</code> (optional) - Message associated with the error</li>
<li><code class="language-plaintext highlighter-rouge">Error Explanation</code>(optional) - Additional error info</li>
</ul>
<p>Verbose? Yes, but verbosity is good; otherwise developers have no way of knowing what they are doing wrong.</p>
<p>You don’t want to leave them pulling their hair out with no information but an error code in the response. Right?</p>
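<p>One way to produce such a verbose-but-uniform response is a small normalizer in the client. Here is a sketch matching the fields above; the <code class="language-plaintext highlighter-rouge">more_info</code> URL pattern reuses the placeholder from the earlier example and is not a real documentation endpoint:</p>

```javascript
// Build the uniform client response described above.
function buildResponse(status, data, errors) {
  const body = { status };
  if (data !== undefined) body.data = data;
  if (errors && errors.length > 0) {
    body.errors = errors.map(e => ({
      message: e.message,
      code: e.code,
      // Placeholder docs URL pattern, not a real endpoint.
      more_info: `http://myApp.com/docs/errors/${e.code}`,
    }));
  }
  return body;
}

console.log(buildResponse(401, undefined, [{ message: 'Authentication', code: 2334 }]));
```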
<h3 id="final-response-block">Final response block</h3>
<p>Combining all the good parts together, the final response should include the following fields:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">status</code> : HTTP Status code</li>
<li><code class="language-plaintext highlighter-rouge">data</code>(optional) : Data retrieved as a result of the request. It could be an array or a single object</li>
<li><code class="language-plaintext highlighter-rouge">errors</code>(optional): Array of errors with <code class="language-plaintext highlighter-rouge">message</code>, <code class="language-plaintext highlighter-rouge">code</code> and <code class="language-plaintext highlighter-rouge">more_info</code> fields</li>
</ul>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "status": "&lt;HTTP status code&gt;",
  "data": "&lt;any data returned&gt;",
  "errors": [
    {
      "message": "&lt;short error description&gt;",
      "code": "&lt;error code&gt;",
      "more_info": "&lt;url to more info page&gt;"
    }
  ]
}
</code></pre></div></div>
<p>Isn’t it <strong>cleaner</strong> and more <strong>informative</strong>, <strong>without losing</strong> any piece of information?</p>
<h2 id="references">References</h2>
<ul>
<li>http://stackoverflow.com/questions/942951/rest-api-error-return-good-practices</li>
<li>https://blog.apigee.com/detail/restful_api_design_tips_for_handling_exceptional_behavior</li>
<li>https://blog.apigee.com/detail/restful_api_design_what_about_errors</li>
<li>https://dev.twitter.com/overview/api/response-codes</li>
</ul>
<p><a href="http://sahilsk.github.io/articles/so-youre-writing-api-client/">So You're Writing Api Client</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on June 08, 2015.</p>http://sahilsk.github.io/articles/hands-on-facebook-reactjs2015-04-01T02:53:21+00:002015-04-01T02:53:21+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr666@gmail.com<p>With the news of facebook launching <a href="https://code.facebook.com/videos/786462671439502/react-js-conf-2015-keynote-introducing-react-native-/">react-native</a>, i couldn’t stop myself from learning <a href="https://facebook.github.io/react/">ReactJS</a>.</p>
<p>Before that, I often asked myself this question at the thought of learning a new javascript framework: <strong>Why do I need to learn one more?</strong></p>
<p>Well, news of <em>react-native</em> piqued my interest and forced me to step out of my comfort zone. I fired up the browser with ignited spirit to devour ReactJS. (:</p>
<h2 id="first-impression">First Impression</h2>
<p>Coming from a background of MVC taught by backbone and angularjs, reactJS was totally new.
ReactJS is the <strong>V</strong> in <strong>MVC</strong>. I then wondered: how is it different from an angularJS directive or ember components?</p>
<p>I also got confused by <strong>one-way reactive data flow</strong>. What does it mean? Data often moves back and forth.
ReactJS uses a virtual DOM. So, how is that useful?</p>
<p>Without mulling over it much, I decided to build a realtime application with it, and when it comes to realtime, the twitter live feed often comes to mind.</p>
<p>So, I made a realtime status-tracking application using ReactJS: <a href="http://track-tweets.herokuapp.com/">http://track-tweets.herokuapp.com/</a>. It took me only two days (not forgetting I go to the office in the daytime) to build this application.</p>
<p><img src="http://sahilsk.github.io/images/realtimeApp.png" alt="RealtTime App" /></p>
<p>Feature list:</p>
<ul>
<li>Show tweets on page load (rendered at server side)</li>
<li>As the user scrolls down and nears the bottom, more tweets are appended from the mongoDB-powered database (lazy loading)</li>
<li>As new tweets matching the track string show up on twitter, the user gets a realtime notification about them</li>
</ul>
<p>Please find source code on <a href="https://github.com/sahilsk/track-tweets">github: sahilsk/track-tweets</a></p>
<p><img src="http://sahilsk.github.io/images/realtimeApp_full.png" alt="RealtTime App" /></p>
<p>In this post, I want to share my views on facebook ReactJS.</p>
<blockquote>
<p>NOTE: Views are totally personal and have nothing to do with the organizations, people, or girls I might be connected to..</p>
</blockquote>
<h2 id="small-learning-curve">Small learning Curve</h2>
<p>Well, having removed <strong>M</strong> and <strong>C</strong> from <strong>MVC</strong>, you’re left with only the <strong>V</strong> to learn, and that is <strong>ReactJS</strong>. So the learning curve is drastically reduced. In a couple of minutes you can get started, and the rest you can learn as you go.</p>
<p>Their getting-started guide is incredibly simple. Why incredibly? Because I didn’t trust it and took Google’s help to search for more advanced ReactJS tutorials. You can save time if you stick with it (though googling won’t harm either).</p>
<p>You don’t need to learn about providers, services, factories, or any more jargon you may have seen in other frameworks.</p>
<h2 id="tidy-manageable-code">Tidy manageable code</h2>
<p>When writing realtime applications, code often gets messy. In reactJS, components help you organize your code by allowing you to divide a big application into smaller components.</p>
<p>For my application, here is what the components look like when put together:</p>
<p><strong>/react/components/tweetApp.react.js</strong></p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">---</span>
<span class="nx">render</span><span class="p">:</span> <span class="kd">function</span><span class="p">(){</span>
<span class="k">return</span> <span class="p">(</span>
<span class="o"><</span><span class="nx">div</span> <span class="nx">className</span><span class="o">=</span><span class="dl">"</span><span class="s2">tweetApp_wrapper</span><span class="dl">"</span> <span class="o">></span>
<span class="o"><</span><span class="nx">Tracker</span> <span class="nx">track</span><span class="o">=</span><span class="p">{</span><span class="k">this</span><span class="p">.</span><span class="nx">state</span><span class="p">.</span><span class="nx">track</span><span class="p">}</span> <span class="sr">/</span><span class="err">>
</span> <span class="o"><</span><span class="nx">Tweets</span> <span class="nx">tweets</span><span class="o">=</span><span class="p">{</span><span class="k">this</span><span class="p">.</span><span class="nx">state</span><span class="p">.</span><span class="nx">tweets</span><span class="p">}</span> <span class="sr">/</span><span class="err">>
</span> <span class="o"><</span><span class="nx">Notification</span> <span class="nx">count</span><span class="o">=</span><span class="p">{</span><span class="k">this</span><span class="p">.</span><span class="nx">state</span><span class="p">.</span><span class="nx">unreadTweets</span><span class="p">.</span><span class="nx">count</span><span class="p">}</span>
<span class="nx">showUnreadTweets</span><span class="o">=</span><span class="p">{</span><span class="k">this</span><span class="p">.</span><span class="nx">showUnreadTweets</span><span class="p">}</span>
<span class="sr">/</span><span class="err">>
</span> <span class="o"><</span><span class="nx">Loader</span> <span class="nx">loading</span><span class="o">=</span><span class="p">{</span><span class="k">this</span><span class="p">.</span><span class="nx">state</span><span class="p">.</span><span class="nx">loading</span><span class="p">}</span> <span class="sr">/> </span><span class="err">
</span> <span class="o"><</span><span class="sr">/div</span><span class="err">>
</span>
<span class="p">)</span>
<span class="p">}</span>
<span class="o">---</span>
</code></pre></div></div>
<h2 id="cool-stuff">Cool stuff</h2>
<p>The ReactJS <strong>virtual DOM</strong> renders the whole DOM in memory first and then pushes only the diff to the browser.
This not only makes browser rendering fast, but also enables you to render the DOM at server side. Cool, isn’t it?</p>
<p>In my application, when the page opens up, it’s ReactJS server-side-rendered components that you see.
As shown in the snippet below, the <code class="language-plaintext highlighter-rouge">renderToString</code> method renders a component as a string, which I pass to my <a href="http://expressjs.com">expressJS</a> <code class="language-plaintext highlighter-rouge">index</code> view template.</p>
<p><strong>/server/routes/index.js</strong></p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">---</span>
<span class="kd">var</span> <span class="nx">markup</span> <span class="o">=</span> <span class="nx">React</span><span class="p">.</span><span class="nx">renderToString</span><span class="p">(</span> <span class="nx">TweetApp</span><span class="p">(</span> <span class="p">{</span><span class="dl">"</span><span class="s2">tweets</span><span class="dl">"</span><span class="p">:</span> <span class="nx">tweets</span> <span class="p">}</span> <span class="p">)</span> <span class="p">);</span>
<span class="nx">res</span><span class="p">.</span><span class="nx">render</span><span class="p">(</span><span class="dl">"</span><span class="s2">index</span><span class="dl">"</span><span class="p">,</span>
<span class="p">{</span> <span class="dl">"</span><span class="s2">layout</span><span class="dl">"</span><span class="p">:</span> <span class="dl">'</span><span class="s1">../../server/views/layouts/main</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">tweets</span><span class="dl">"</span><span class="p">:</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="nx">tweets</span><span class="p">),</span>
<span class="dl">"</span><span class="s2">markup</span><span class="dl">"</span><span class="p">:</span> <span class="nx">markup</span>
<span class="p">}</span> <span class="p">);</span>
<span class="o">---</span>
</code></pre></div></div>
<p>Why is this important? Well, I care about my website’s reachability, so I care about its SEO.
Though the google crawler can <em><a href="http://stackoverflow.com/questions/2061844/do-googles-crawlers-interpret-javascript-what-if-i-load-a-page-through-ajax">interpret</a></em> javascript, you still have to rely on third-party services like <a href="https://prerender.io/">prerender.io</a> or phantomJS to boost your SEO.</p>
<h2 id="what-next">What next?</h2>
<p><a href="http://facebook.github.io/react-native/">Facebook react-native</a>: React Native enables you to build world-class application experiences on native platforms using a consistent developer experience based on JavaScript and React.</p>
<p>At the time of writing this article react-native supports only iOS.</p>
<p><a href="https://facebook.github.io/flux/docs/overview.html">Facebook Flux</a>: an application architecture promoting uni-directional data flow in applications.</p>
<p><a href="https://github.com/facebook/flow">Facebook Flow</a>: adds static typing to JavaScript to improve developer productivity and code quality. See <a href="http://flowtype.org/">flowtype.org</a>.</p>
<p>…and hopefully more in future. Be hopeful please…</p>
<h2 id="conclusion">Conclusion</h2>
<p>ReactJS, Flux & Flow are more about the philosophical aspect of development than about learning a few more technical jargon terms. Phew…</p>
<p>My new year resolution is to learn more about ReactJS, React Native, Flux & Flow.</p>
<p>Source code: <a href="https://github.com/sahilsk/track-tweets">sahilsk/track-tweets</a></p>
<p><a href="http://sahilsk.github.io/articles/hands-on-facebook-reactjs/">Hands on Facebook #reactjs</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on April 01, 2015.</p>http://sahilsk.github.io/articles/mongodb-standalone-server-in-docker2015-02-18T14:13:10+00:002015-02-18T14:13:10+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr666@gmail.com<h2 id="mongodb-245-inside-docker">MongoDB 2.4.5 inside Docker</h2>
<h2 id="dockerfile">Dockerfile</h2>
<p><strong>Dockerfile</strong> is to Docker what <em>Makefile</em> is to make</p>
<p>file <em>Dockerfile</em></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#</span>
<span class="c"># MongoDB 2.4.5 Dockerfile</span>
<span class="c">#</span>
<span class="c">#</span>
FROM ubuntu:12.04
MAINTAINER stackexpress <span class="s2">"http://stackexpress.com"</span>
RUN apt-get update
RUN apt-get <span class="nb">install</span> <span class="nt">-y</span> make gcc wget
RUN wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.4.5.tgz <span class="nt">-O</span> /tmp/pkg.tar.gz
RUN <span class="o">(</span><span class="nb">cd</span> /tmp <span class="o">&&</span> <span class="nb">tar </span>zxf pkg.tar.gz <span class="o">&&</span> <span class="nb">mv </span>mongodb-<span class="k">*</span> /opt/mongodb<span class="o">)</span>
RUN <span class="nb">ln</span> <span class="nt">-s</span> /opt/mongodb/bin/mongo /usr/local/bin/mongo
RUN <span class="nb">ln</span> <span class="nt">-s</span> /opt/mongodb/bin/mongod /usr/local/bin/mongod
RUN <span class="nb">rm</span> <span class="nt">-rf</span> /tmp/<span class="k">*</span>
RUN <span class="nb">mkdir</span> <span class="nt">-p</span> /data/db
<span class="c"># Define mountable directories.</span>
VOLUME <span class="o">[</span><span class="s2">"/data/db"</span><span class="o">]</span>
<span class="c"># Define working directory.</span>
WORKDIR /data
EXPOSE 27017
EXPOSE 28017
CMD <span class="o">[</span><span class="s2">"/opt/mongodb/bin/mongod"</span>, <span class="s2">"--rest"</span><span class="o">]</span>
</code></pre></div></div>
<h2 id="build">Build</h2>
<p>Now we’ll build the above Dockerfile and tag it under <code class="language-plaintext highlighter-rouge">sahilsk/mongo_2.4.5</code>.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## CD to 'Dockerfile' directory</span>
<span class="nv">$ </span>docker build <span class="nt">-t</span> sahilsk/mongo_2.4.5 <span class="nb">.</span>
</code></pre></div></div>
<h2 id="run">RUN</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ docker run -d sahilsk/mongo_2.4.5
</code></pre></div></div>
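<p>Putting the pieces together, a fuller <code class="language-plaintext highlighter-rouge">docker run</code> might publish the two exposed ports and persist the declared <code class="language-plaintext highlighter-rouge">/data/db</code> volume on the host. This is a sketch; the host path and container name are invented for illustration.</p>

```shell
# Publish mongod's ports and bind-mount a host directory over the
# declared /data/db volume so data survives the container.
# /srv/mongo-data and the name "mongo245" are illustrative.
docker run -d \
  -p 27017:27017 \
  -p 28017:28017 \
  -v /srv/mongo-data:/data/db \
  --name mongo245 \
  sahilsk/mongo_2.4.5

# Connect from the host (requires a mongo shell installed on the host):
mongo localhost:27017
```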
<h2 id="lets-do-some-benchmarking">Let’s do some benchmarking</h2>
<p>We’ll employ <a href="http://docs.mongodb.org/manual/reference/program/mongoperf/"><em>mongoperf</em></a> here.</p>
<blockquote>
<p>Mongoperf is a utility for checking disk i/o performance of a server independent of MongoDB. It performs simple timed random disk i/o’s.</p>
</blockquote>
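<p><em>mongoperf</em> reads its options as a JSON document on stdin. A minimal invocation against the image built above might look like this (the option values are illustrative, not a tuned benchmark):</p>

```shell
# Mixed read/write test against the real filesystem (mmf:false),
# run inside a throwaway container from the image built earlier.
echo "{ nThreads: 16, fileSizeMB: 1000, r: true, w: true, mmf: false }" \
  | docker run -i sahilsk/mongo_2.4.5 /opt/mongodb/bin/mongoperf
```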
<p>…to be continued</p>
<p><a href="http://sahilsk.github.io/articles/mongodb-standalone-server-in-docker/">MongoDB Standalone Server in Docker</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on February 18, 2015.</p>http://sahilsk.github.io/articles/database-in-docker2015-02-18T13:29:37+00:002015-02-18T13:29:37+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<p>You’ll rarely hear of people running a database in a container in a production environment. If there are any, they must be gutsy.</p>
<p>The database is a critical component of every three-tier service. With the advent of NoSQL, especially document databases like MongoDB, database selection also plays a role in deciding your technology stack. While the failure of any component in the stack may take down your whole service, when the database goes down you also lose data.</p>
<blockquote>
<p>Critics may say: hey, the same is true for any stateful component. But how many stateful components do you see besides databases nowadays? Smart people don’t use files for persisting information. Almost all application state, including sessions, is stored in a persistent, fast-access layer, which 99% of the time is a database.</p>
</blockquote>
<h2 id="so-should-i-dare-to-run-database-in-docker">So, should I dare to run a database in Docker?</h2>
<p>It might look like a very daunting question, but if we paraphrase it, it looks like this:</p>
<p>With database running,</p>
<ol>
<li>I should not lose my data.</li>
</ol>
<p>My data should persist on disk and should stay there even when the service is stopped.</p>
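<p>With Docker, this boils down to keeping the data directory on a volume. A minimal sketch (the container name and host path are invented for illustration):</p>

```shell
# Bind-mount a host directory as the data dir; removing the container
# does not remove the data.
docker run -d --name mydb -v /srv/db-data:/data/db sahilsk/mongo_2.4.5

docker rm -f mydb    # the container is gone...
ls /srv/db-data      # ...but the data files are still on the host
```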
<ol>
<li>I should be able to collect database metrics</li>
</ol>
<p>Most metrics usually come from logs. Needless to say, logs also help us debug when some anomaly occurs in operation.</p>
<ol>
<li>I should be able to shut it down gracefully if it is not in use</li>
</ol>
<p>Some databases require a cleanup operation during the shutdown phase, which also removes locks and releases resources. When a database is forcefully killed, some resources don’t get released. Usually a database responds to SIGINT and SIGTERM by trapping them and shutting down cleanly; SIGKILL cannot be trapped.</p>
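<p>In Docker terms, this means preferring <code class="language-plaintext highlighter-rouge">docker stop</code> (SIGTERM, then SIGKILL after a grace period) over <code class="language-plaintext highlighter-rouge">docker kill</code>, and giving the database a generous grace period. A sketch, with an invented container name:</p>

```shell
# SIGTERM first; escalate to SIGKILL only after 60 seconds, giving
# the database time to flush, release locks and remove lock files.
docker stop --time=60 mydb

# "docker kill" sends SIGKILL immediately -- avoid it for databases.
```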
<ol>
<li>I should be able to start it with the last saved data.</li>
</ol>
<p>When the database restarts, it should resume operation from the last saved state. State is usually saved in files on disk, so effective journaling is required.</p>
<ol>
<li>I should have minimal latency, i.e.
<ul>
<li>no network latency</li>
</ul>
<p>In a distributed environment, network latency can become an overhead. With thousands of I/O operations per second it can easily become a bottleneck.</p>
</li>
</ol>
<ul>
<li>
<p>no disk i/o latency</p>
<p>Memory caching and journaling help minimize disk I/O operations, but faster disk access always leads to higher throughput. With hardware getting cheaper, SSDs have gained popularity, and being fast often means fast operation.</p>
</li>
</ul>
<ol>
<li>
<p>I should be able to scale it</p>
<p>Through master-master or master-slave replication</p>
</li>
</ol>
<p>Not that daunting now. Right?</p>
<p>Let me recap Docker’s performance impact on an application running inside it.
In performance, it’s tantamount to the speed you get natively. Here’s a brief overview.</p>
<ul>
<li>
<p><strong>CPU</strong> : CPU performance is totally Native. Same as you get without running inside Docker.</p>
</li>
<li>
<p><strong>I/O</strong> : Docker supports<a href="http://www.projectatomic.io/docs/filesystems/"> many storage backends</a>. Device Mapper is used by default and can be switched as needed. That said, simply switching the storage backend won’t get you the faster <strong>iops</strong> that you get natively. However, if you use <a href="https://docs.docker.com/userguide/dockervolumes/">docker volumes</a> you’ll get native speed. Volumes also have a few extra advantages which I’ll cover in a future post.</p>
</li>
<li>
<p><strong>Memory</strong> : Docker sets aside a little memory for memory accounting. However, it can be native, i.e. no overhead, if you disable memory accounting (useful for HPC, probably not for everything else).</p>
</li>
<li>
<p><strong>Network</strong>: There is no overhead if you run with <code class="language-plaintext highlighter-rouge">--net host</code>. It’s useful for > 1Gb/s workloads or if you have a high packet rate, e.g. VOIP or gaming.</p>
</li>
</ul>
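<p>Combining those points, a database container tuned for native-like performance might use a volume for the data directory and host networking. The paths and image below are carried over from the earlier MongoDB example purely for illustration:</p>

```shell
# Volume => native disk I/O; --net host => no bridge/NAT overhead.
docker run -d \
  --net host \
  -v /srv/db-data:/data/db \
  sahilsk/mongo_2.4.5
```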
<p>I might be missing a few other things, especially ones specific to applications like databases. Pardon my ignorance. Here, I’d appreciate help from the community in adding their views in the comments.</p>
<p><a href="http://sahilsk.github.io/articles/database-in-docker/">Database in Docker</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on February 18, 2015.</p>http://sahilsk.github.io/articles/application-monitoring-dashboard-solution2014-12-19T23:53:42+00:002014-12-19T23:53:42+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<p>In this post I’ll discuss some of the options available for application
monitoring. As your application grows, your technology stack also grows,
and so do the surprises that come along. These often lead to out-of-service
statuses which I presume no organization, small or big, can bear.</p>
<p>So, let’s get started.</p>
<p><strong>Application Monitoring</strong> : It can range from minimal monitoring of application
live status up to application performance monitoring.
When you have all these metrics before you, you’ll be able to gauge your
infrastructure’s efficiency and scalability in a more predictive manner.</p>
<p>Imagine your application is getting popular and the number of users is increasing
day by day. With these metrics you’ll be able to see how much RAM and CPU, or how many instances,
are being used and how much is available. With current resource utilization and
growing user-base stats in hand, you can take the necessary measures to ensure smooth operation of your
service.</p>
<h2 id="what-are-these-metrics-that-we-should-be-wary-about">What are these metrics that we should be wary about?</h2>
<p>Let’s start from server itself.</p>
<p>A few important metrics are: <strong>RAM</strong>, <strong>CPU</strong>, <strong>disk space</strong>, <strong>bandwidth</strong>, <strong>disk I/O</strong>, <strong>DB reads/writes</strong>, etc.</p>
<ul>
<li>
<p><strong>Disk I/O</strong></p>
<p>The processor will wait until the process finishes reading a file. So, having fast disk input/output operations on a physical disk will improve your application’s performance significantly.</p>
<p>That’s why SSDs are popular nowadays: roughly 30% faster file-opening speed than a traditional HDD. The copy or write speed of an SSD is typically about 200-550 MB/s, while for an HDD it’s much lower, around 50-120 MB/s.</p>
<p>You can easily see the difference in turnaround time of your disk-I/O-heavy application just by changing a single piece of hardware.
But do you really require an SSD? Look at your application metrics and answer for yourself ;)</p>
</li>
<li>
<p><strong>RAM</strong></p>
<p>The more RAM, the more processes can run and stay in memory longer.
This means less disk I/O and page swapping.
With hardware getting cheap, having extra RAM won’t cost much.</p>
<p>However, knowing when to increase RAM is the purpose of this post. One option is simply to take note of RAM usage over the last couple of days and decide your next step.</p>
</li>
<li>
<p><strong>CPU</strong></p>
<p>For computation-heavy applications, RAM alone is not enough; you need a powerful brain to process it all. So do watch your CPU performance as well. Generally, if metrics show CPU usage greater than 80% most of the time, you should upgrade to more powerful machines with more cores inside.</p>
</li>
<li>
<p><strong>Bandwidth</strong></p>
<p>Not only will it help track your monthly bandwidth bills, it will also help you trace anomalies. Watching bandwidth usage patterns often gives you clues about what’s going wrong, if anything is.</p>
</li>
</ul>
<h2 id="next-step-is-how-to-get-these-metrics">Next Step is, how to get these metrics?</h2>
<p>Start with the web server. If you use nginx, gather its statistics.
Afterward, gather app-server statistics. If you use RoR you might have used or tried Unicorn, so you may need to dig into its log file to gather metrics out of it.</p>
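<p>For nginx, basic statistics usually come from the <code class="language-plaintext highlighter-rouge">stub_status</code> module. Assuming a location such as <code class="language-plaintext highlighter-rouge">location /nginx_status { stub_status on; }</code> is configured, the counters can be scraped with curl:</p>

```shell
# Scrape nginx's basic counters (the URL is whatever you configured).
curl -s http://127.0.0.1/nginx_status
# The output has this well-known shape (numbers illustrative):
#   Active connections: 3
#   server accepts handled requests
#    119 119 251
#   Reading: 0 Writing: 1 Waiting: 2
```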
<ul>
<li>
<p><strong><a href="https://www.phusionpassenger.com/documentation_and_support">Phusion Passenger</a></strong></p>
<p><code class="language-plaintext highlighter-rouge">Phusion Passenger</code> makes all this easy, as it’s both a web server and an app server.
It provides administration tools that allow you to detect whether an application is stuck and non-responsive.</p>
<p>It lets you watch and monitor many application-level, process-level and system-level statistics from a central place.</p>
<ul>
<li>CPU, memory and swap usage, both system-wide and per-process.</li>
<li>Connections and requests.</li>
<li>Hypervisor VM interference rates.</li>
<li>Load averages, fork rates and swap rates.</li>
<li>Application-level backtraces.</li>
</ul>
<p>The best part is <em>easy integration with external tools</em>:
these statistics are queryable over an HTTP JSON API, allowing you to integrate them easily with other tooling.</p>
<p>These stats can be pushed into centralized storage (MongoDB or MySQL).</p>
</li>
<li>
<p><strong><a href="https://collectd.org/">CollectD</a></strong></p>
<p>collectd is a daemon which collects system performance statistics periodically and provides mechanisms to store the values in a variety of ways, for example in RRD files.
Out of the box it provides a multitude of <a href="https://collectd.org/wiki/index.php/Table_of_Plugins">plugins</a> (cpu, memory, network, disk, etc.).
For bandwidth usage you can look at the <a href="https://github.com/Cosmologist/collectd-network-bandwidth-usage">collectd-network-bandwidth-usage</a> plugin.</p>
<p>These stats can be collected in MongoDB through the <a href="4">Write MongoDB plugin</a>, or pushed into a Graphite db through the <a href="https://github.com/indygreg/collectd-carbon">collectd-carbon</a> plugin.</p>
</li>
</ul>
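<p>A minimal collectd setup along these lines can be sketched as a shell step that enables a few stock plugins and points them at Graphite. The hostname is invented; newer collectd versions ship a native <code class="language-plaintext highlighter-rouge">write_graphite</code> plugin, which plays the same role as the collectd-carbon plugin mentioned above.</p>

```shell
# Append a minimal config: collect cpu/memory/network/disk stats and
# ship them to a Carbon listener (graphite.example.com is illustrative).
cat >> /etc/collectd/collectd.conf <<'EOF'
LoadPlugin cpu
LoadPlugin memory
LoadPlugin interface
LoadPlugin df
LoadPlugin write_graphite

<Plugin write_graphite>
  <Node "graphing">
    Host "graphite.example.com"
    Port "2003"
    Protocol "tcp"
    Prefix "collectd."
  </Node>
</Plugin>
EOF

service collectd restart
```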
<h2 id="how-to-create-a-monitoring-dashboard-">How to create a monitoring Dashboard ?</h2>
<p>Let me go from minimal monitoring to advanced monitoring options that can be used to monitor your application and the servers it runs on.</p>
<ul>
<li>
<p>Minimal application monitoring: (Monit)</p>
<p><a href="http://mmonit.com/monit/">monit</a> can be used to monitor your endpoints. It provides a nice and simple web interface to see live status.</p>
<blockquote>
<p>Monit is a small Open Source utility for managing and monitoring Unix systems. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.</p>
</blockquote>
<p>It provides a nice web GUI where you can see the live status of application interfaces and services.</p>
<p>If any service goes down or resource utilization crosses a threshold, it will take the action specified by you: restarting the crashed service, mailing the system admin, or executing a script or command you provide.</p>
<p>Monit can help you if you have a small infrastructure. But as the infrastructure grows, handling a dashboard for every server becomes wearying. To ease this you can use <a href="https://github.com/karmi/monittr">Monittr</a>.</p>
<blockquote>
<p>Monittr provides a Ruby interface for the Monit systems management system. Its main goal is to aggregate statistics from multiple Monit instances and display them in an attractive web interface.</p>
</blockquote>
<p>For live monitoring and timely updates on service crashes, downtime, or excessive resource utilization, Monit can serve your purpose.</p>
<p>If you want to study resource utilization patterns, to gauge your infrastructure’s growth and scale it accordingly, you need to store all these metrics in centralized storage or a database.
The next solution lets you read past stored data as well.</p>
</li>
</ul>
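<p>For a flavour of what this looks like, here is a sketch of a monitrc fragment: it enables the built-in web GUI and watches nginx, restarting it on failure and alerting on high CPU. Paths, port and credentials are illustrative.</p>

```shell
# /etc/monit/monitrc fragment (monit's own config syntax, not shell)
set httpd port 2812 and allow admin:monit   # the web GUI mentioned above

check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program  = "/etc/init.d/nginx stop"
  if failed port 80 protocol http then restart
  if cpu > 80% for 5 cycles then alert
```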
<p>Recently I came across an application monitoring architecture used by one of the pioneering bed & breakfast service providers.
This solution is taken from them.</p>
<ul>
<li>Monitoring like a boss (design it yourself)</li>
</ul>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">Phusion Passenger CollectD+
+ +
| |
| |
v v
Mongodb Carbon<span class="o">(</span>Graphite<span class="o">)</span>
+-------++----------+
<span class="o">||</span>
<span class="o">||</span>
<span class="o">||</span>
+-----vv-----+
| |
| JSON API |
| |
| |
+-----+------+
|
v
+------------+
| Your cool |
| Dashboard |
| |
+------------+ </code></pre></figure>
<p>Send <em>Passenger</em> data into MongoDB. Send <em>CollectD</em> metrics into the <em>Graphite</em> database (<em>whisper</em>). Graphite provides a JSON API, the <a href="http://graphite.readthedocs.org/en/latest/render_api.html">render URL API</a>, which can serve as a data endpoint for real-time metrics.</p>
<p>Now, with collected JSON data from MongoDB and Carbon, you can have a common central JSON API endpoint that can power your dashboard to visualize metrics.</p>
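<p>For example, the render API can be queried straight from the command line; the series name below is invented and depends on how collectd prefixes its metrics:</p>

```shell
# Last hour of an idle-CPU series as JSON, ready for a dashboard widget.
curl -s "http://graphite.example.com/render?target=collectd.web01.cpu-0.cpu-idle&from=-1h&format=json"
```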
<h3 id="little-on-choosing-technology-for-building-dashboard">Little on choosing technology for building Dashboard</h3>
<p>The big problem broken down into small sub problems.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">
Get Data -> Store data -> Provide JSON Interface -> Pull & Visualize Data</code></pre></figure>
<p>For building a <em>realtime dashboard</em> you can use any JavaScript MVC framework: <strong>AngularJS</strong> or <strong>EmberJS</strong>.</p>
<p>For creating widgets ( gauge, pie chart, geomap etc) you can use <strong>AngularJS</strong> template <a href="https://docs.angularjs.org/guide/directive">directives</a> or <strong>emberjs</strong> <a href="http://emberjs.com/guides/components/">components</a>.</p>
<p>While these approaches are fine, I would like to introduce <a href="http://webcomponents.org/">web components</a> here, especially <a href="https://www.polymer-project.org/">Polymer</a>.</p>
<blockquote>
<p>Web Components usher in a new era of web development based on encapsulated and interoperable custom elements that extend HTML itself. Built atop these new standards, Polymer makes it easier and faster to create anything from a button to a complete application across desktop, mobile, and beyond.</p>
</blockquote>
<p>With Polymer you can create custom widgets very cleanly. The best part is you can re-use them in your next cool projects.</p>
<p>So, by plugging the pieces together through standard interfaces, you can build a nice application monitoring dashboard for your infrastructure.</p>
<p><a href="http://sahilsk.github.io/articles/application-monitoring-dashboard-solution/">Application Monitoring Dashboard Solution</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on December 19, 2014.</p>http://sahilsk.github.io/articles/do-we-need-a-application-configuration-manager2014-11-11T08:39:10+00:002014-11-11T08:39:10+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h1 id="do-we-need-a-centralized-application-configuration-manager-">Do we need a centralized application configuration manager ?</h1>
<p>NOTE: Please don’t confuse this with configuration management tools like Ansible or Opscode Chef. The point here is: after the application is deployed on a server, if any configuration change to a run parameter is required, how do you make it available to your already running application?</p>
<p>Let’s start with deployment. For deployment any CM tool will do: Ansible or Chef. I prefer Ansible as it leaves almost zero footprint on the deployed server.</p>
<p>For application configuration management, you’ve two ways:</p>
<ol>
<li>
<p>Run your CM script (Ansible, Chef, or Puppet) again, OR</p>
</li>
<li>
<p>Make your application listen for configuration change events.</p>
<p>In case of any change, your application should pull the changed configuration, update itself and restart (if required; in most cases it will be).</p>
</li>
</ol>
<h2 id="ill-chose-first-option-and-i-have-reasons-for-it">I’ll choose the first option. And I have reasons for it.</h2>
<p>In a production environment (especially for payment sites) I cannot afford to have my servers behave in an unpredictable manner. A configuration change should first be tested in staging, then go through canary testing, before being replicated to all production instances.</p>
<p>If you are planning to create an application configuration manager, be prepared to make your applications handle the cases that come with this change. A few of those cases are included here:</p>
<ul>
<li>
<p>What if the centralized configuration server is down?</p>
<p>Your centralized configuration server could easily become a single point of failure and a bottleneck, so additional care is needed in maintaining it. You need to find ways to make it more fault-tolerant and highly available. After all, you have promised 99.999% availability to your end customers.</p>
</li>
<li>
<p>What to do when a configuration change event is triggered?</p>
<p>This, however, is the toughest case to handle and the sole purpose of bringing a centralized configuration manager into the stack.
To propagate a change, you first need to make sure the change is reflected perfectly on the first server, i.e. the <em>canary test passed</em> and it’s running smoothly. Only then should the change be allowed onto the rest of the instances. So, you need to implement, handle & test this incremental update.</p>
<p>The simplest implementation could be just a fast database (preferably an in-memory db like Redis) with fault tolerance & high availability. If you use Redis you can also make use of its pub-sub feature.
If you need to reflect changes speedily, a distributed key-value db like ZooKeeper or etcd can be used.</p>
<p>Basically, you’ll end up writing a daemon that makes decisions like rolling back on failure and notifies your monitoring server about the propagated change and its effect: was it successful? Why did it fail? And so on.</p>
</li>
</ul>
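<p>The Redis pub-sub idea above can be sketched with <code class="language-plaintext highlighter-rouge">redis-cli</code>; the key and channel names are invented for illustration:</p>

```shell
# Publisher side: store the new value, then announce which key changed.
redis-cli SET config:payment.timeout 30
redis-cli PUBLISH config:changed payment.timeout

# Each application instance runs a subscriber and re-reads the key
# whenever a message arrives on the channel.
redis-cli SUBSCRIBE config:changed
```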
<p>Oh dear, you just added one more component to your architecture. That means maintenance, support, bug fixing, releases and, finally, more headaches.</p>
<h2 id="i-would-prefer-first-option-as-it-makes-my-life-simple-here-is-how">I would prefer the first option as it makes my life simple. Here is how.</h2>
<p>What makes an ideal production server?</p>
<ul>
<li>
<p><strong>Immutable server</strong>:</p>
<p>A server that, once deployed, is never modified, merely replaced with a new updated instance. Even the slightest change, like a configuration key change, should not be allowed. This way we can keep our infrastructure in a known state all the time.</p>
</li>
<li>
<p><strong>phoenix server</strong>:</p>
<p>Servers should automatically re-sync with a known baseline. Tools like Ansible, Puppet and Chef have facilities to do this. If the server reboots for any of many chaotic reasons, you can simply make use of CM features like <code class="language-plaintext highlighter-rouge">chef-client</code> in the case of Opscode Chef and <code class="language-plaintext highlighter-rouge">ansible-pull</code> in the case of Ansible, to bring the server back to a known state.</p>
</li>
</ul>
<p>I use Ansible to create immutable server instances. If there is any change I need to propagate, I re-run my Ansible script, apply the changes, test, and if it passes, run the script on all other servers.
For automatic re-sync on restart/boot I run <code class="language-plaintext highlighter-rouge">ansible-pull</code>, which pulls changes automatically.</p>
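<p>The <code class="language-plaintext highlighter-rouge">ansible-pull</code> re-sync can be wired up roughly like this; the repository URL and playbook name are placeholders:</p>

```shell
# Pull the playbook repo and apply local.yml on this host.
ansible-pull -U https://github.com/example/infra-playbooks.git local.yml

# A cron entry keeps the server converging on the known baseline:
# */30 * * * * root ansible-pull -U https://github.com/example/infra-playbooks.git local.yml >> /var/log/ansible-pull.log 2>&1
```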
<p><a href="http://sahilsk.github.io/articles/do-we-need-a-application-configuration-manager/">Do We Need a Application Configuration Manager?</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on November 11, 2014.</p>http://sahilsk.github.io/articles/why-docker2014-11-09T14:45:14+00:002014-11-09T14:45:14+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h1 id="because-we-all-love-docker-3-3-3">Because we all love Docker. <3 <3 <3</h1>
<blockquote>
<p>Build, Ship and Run Any App, Anywhere
- The Docker team</p>
</blockquote>
<p>Docker is all the rage. Let me start this post by putting down some metrics taken from Docker’s official <a href="https://github.com/docker/docker">github repository</a> on 9th Nov ’14:</p>
<ul>
<li>
<p><strong>16438</strong> github stars</p>
<p>Stars are just another way of saying “I love <em>docker</em>”</p>
</li>
<li>
<p>Forked <strong>3225</strong> times</p>
<p>Just another way of saying “I care for docker”.</p>
</li>
<li>
<p><strong>666</strong> active contributors</p>
<p>The open-source community is committed to improving Docker more and more. Not only this: with recent collaborations with RedHat, Microsoft and a few other cloud providers, Docker is getting more secure and more reliable, and is now on a path to becoming cross-platform soon. Yes, Windows users will soon be able to get a taste of this open-source recipe.</p>
</li>
<li>
<p><strong>77</strong> releases</p>
<p>With more releases coming, <em>Docker</em> is getting <em>better and better</em>, and this metric conveys the same. With every new release comes good news for the community.</p>
<p>The limit of 127 aufs layers was removed in an earlier version. With 0.9, Docker dropped LXC as the default execution environment; 0.11 brought tightened <em>security</em>, API consistency and more stability; 0.12 added the notable pause/unpause feature; 1.0.0 brought more stability and production support; and 1.0.1 brought enhanced security for the lxc driver.</p>
<p>You can read more about changelog <a href="https://github.com/docker/docker/blob/release/CHANGELOG.md">here</a></p>
</li>
</ul>
<p><img src="/images/contributor_github.png" alt="Alt Contributors " /></p>
<p>Docker Project was released as open source in March 2013. Since then these numbers are increasing day by day.</p>
<p>Not only this, many startups have been built around Docker: <em>Runnable.com</em>, <em>Stackdock</em>, <em>Orchard</em>, <em>Tutum</em>, <em>quay.io</em>, <em>deis.io</em>, etc., to name a few.
There is one beautifully created MindMeister map of the <a href="http://www.mindmeister.com/389671722/docker-ecosystem">docker ecosystem</a> created by Krishnan Subramanian. You will find more names in there.</p>
<p>With a heavily active community, more security enhancements, and cross-platform support coming soon, I think these stats are enough to build confidence in Docker and to bring it into production without cringing.
Let me highlight a few more points on Docker here in this post.</p>
<h2 id="same-performance-at-almost-zero-cost">Same performance at almost zero cost</h2>
<blockquote>
<p>With negligible resource use, you get the same performance as you get running without it.</p>
</blockquote>
<p>Well, firstly, Docker provides a <em>virtual environment</em> for your technology stack to run in, unlike others who merely provide you just another virtual machine, where resources are tightly coupled and, worst of all, not shareable.
With <code class="language-plaintext highlighter-rouge">docker</code> you get almost zero performance downgrade.</p>
<p>In performance, it’s tantamount to the speed you get natively. Here’s a brief overview.</p>
<ul>
<li><strong>CPU</strong> : Native</li>
<li><strong>I/O</strong> : Native on volumes
(make sure that your data set etc. is on volumes)</li>
<li><strong>Memory</strong> : Docker sets aside a little memory for memory accounting. However, it can be native, with no overhead, if you disable memory accounting (useful for HPC, probably not for everything else)</li>
<li><strong>Network</strong>: There is no overhead if you run with <code class="language-plaintext highlighter-rouge">--net host</code> (useful for > 1Gb/s workloads or if you have a high packet rate, e.g. VOIP, gaming)</li>
</ul>
<blockquote>
<p>Productivity and efficiency are just nothing but synonyms of Docker.</p>
</blockquote>
<p>So, with Docker you can get the same performance for your technology stack as you get natively. Besides, you can share resources among many containers, thereby leveraging resources efficiently and thus spinning up fewer servers.</p>
<blockquote>
<p>While security, reliability, community & production support make Docker stand apart, performance and a super awesome feature list make it a must-have component of every stack.</p>
</blockquote>
<h2 id="notable-docker-features">Notable docker features</h2>
<ul>
<li>
<p>Pause/Unpause container</p>
<p>While the stop and start features were already there in Docker, with recently released versions you can now <code class="language-plaintext highlighter-rouge">pause</code> and <code class="language-plaintext highlighter-rouge">unpause</code> your containers as well.</p>
<p><em>Why do you care?</em> You can always use SIGSTOP and SIGCONT on all the processes in the process tree for a container to implement it yourself but they are not always sufficient for stopping and resuming tasks in userspace.</p>
<p>That’s where the <a href="https://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt">cgroup freezer</a> comes into the picture, and the same is now implemented in Docker to provide pause/unpause functionality.<br />
Docker also exposes this functionality via its API.</p>
</li>
<li>
<p>Build Once, Run Anywhere</p>
<p><em>No more configuration drift, no more surprises</em> : This is what Docker guarantee. This is a huge win for DevOps community.</p>
<p>An experienced sysadmin knows that the steps used to successfully deploy a stack on one machine don’t necessarily bring the same result on another machine.
But with Docker it’s now possible to achieve <code class="language-plaintext highlighter-rouge">immutable servers</code>, thereby keeping the infrastructure clean of <em>snowflake</em> servers. Yippee!!!</p>
</li>
<li>
<p>Zero Downtime Deployment</p>
<p>With the advancement in technology, zero-downtime deployment is possible. You no longer need to put an “out of service for maintenance” signboard on your product.</p>
<p>Docker just happens to be one such technology, which allows you to achieve this at negligible cost. Without switching back and forth between environment servers, you can deploy your changes, do canary testing and see the features rolling out in production right before your eyes, without cringing.</p>
</li>
<li>
<p>Spinning up new instances is a breeze</p>
<p>Some scenarios demand spinning up more servers in minimum time to accommodate a sudden surge in traffic; when traffic goes down, these servers should perish. Whether you use <em>aws auto-scaling</em> or in-house automation tools, it surely takes a minimum of 2 or 3 minutes (without counting server spin-up time) to launch new application instances. For heavy-traffic sites like e-commerce, on events like a big sale or a new product launch, these 2-3 minutes are enough to wipe out the entire stock in a flash.
Recently, the leading e-commerce giant of India, Flipkart, witnessed public outrage when their infrastructure failed to scale up with demand. Though they claimed to have 5000 servers ready at their disposal, even then they failed to keep the end customers happy. The outburst of customers on social platforms like Twitter and FB was clearly <a href="http://gadgets.ndtv.com/internet/news/flipkart-big-billion-day-sale-riddled-with-problems-602498">seen</a>.</p>
<p>As DevOps, we used to provide on-demand instances using chef-server and aws auto-scaling prior to witnessing the miracles of Docker. Whenever a new instance was needed, auto-scaling spun up a new server using a pre-configured AMI. When the server booted up, an upstart script ran chef-client to pull the latest files and configuration required to run the application. This all typically took 5 to 6 minutes.
With Docker it now takes 1-2 minutes or less, which is a big win.</p>
</li>
<li>
<p>Maintainable Deployment Scripts</p>
<p>Docker provides the Dockerfile for building images. Running an image gives you a <strong>“container”</strong>. Whether you want to install nginx or a framework like <code class="language-plaintext highlighter-rouge">ror</code>, everything can be defined in the Dockerfile in a simple, easy-to-learn DSL.</p>
<p>The best part is that you can share images with each other. If you want MySQL, you can simply pull the <a href="https://registry.hub.docker.com/_/mysql/">MySQL image</a> from <a href="https://registry.hub.docker.com">Docker Hub</a>: <em>write once, deploy many times</em>.</p>
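<p>As an illustration, a minimal Dockerfile for an nginx image might look like the sketch below (the base image, package, and port choices are merely examples):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Base image
FROM ubuntu:14.04

# Install nginx
RUN apt-get update &amp;&amp; apt-get install -y nginx

# Expose the HTTP port and keep nginx in the foreground
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
</code></pre></div></div>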
</li>
<li>
<p>SysAdmin chores made easy and tidy</p>
<ul>
<li>Backups</li>
<li>Logging</li>
<li>Remote access</li>
<li>Moving containers around</li>
<li>etc etc</li>
</ul>
<p>With mountable volumes, such chores become very easy. Just spin up a cleaning container using the volumes from your application containers, and with separation of concerns you get a tidy way to clean up and back up your application.</p>
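<p>For example, a nightly backup can be a throwaway container that borrows the app’s volumes. The container and image names below are illustrative:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># One-off backup container that sees the app's /var/log/dailyReport volume
docker run --rm --volumes-from app-001.example.com \
    -v /backups:/backups ubuntu \
    tar czf /backups/dailyReport-logs-$(date +%F).tar.gz /var/log/dailyReport
</code></pre></div></div>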
<blockquote>
<p>Docker comes as a boon to chaotic, cluttered life of sysAdmin by making everything easy to handle, maintain and share.</p>
</blockquote>
</li>
</ul>
<p>By incorporating this one tool in production, not only does every vertical in the organisation benefit, but so do end customers:</p>
<ul>
<li>Fast change roll out & deployment</li>
<li>Fewer server spin-ups and, therefore, lower monthly costs</li>
<li>Dockerfiles are easy to write, which means a short learning curve for new team members</li>
<li>High availability of the service keeps customer confidence intact.</li>
</ul>
<p>With Docker there is a win-win for all stakeholders.</p>
<p><a href="http://sahilsk.github.io/articles/why-docker/">Why docker?</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on November 09, 2014.</p>http://sahilsk.github.io/articles/rubyonrails-app-on-docker-part-iii-answering-failures2014-10-07T02:42:48+00:002014-10-07T02:42:48+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h1 id="rubyonrails-app-on-docker-part-iii-making-it-robust-final-conclusion">RubyOnRails App On Docker: Part-III Making it Robust, final conclusion</h1>
<p>At this point I assume you have the containerized RoR app up and running, and can stop and start containers with ease. Here we’ll try to answer the scale and HA questions for each of our components, one by one.</p>
<ul>
<li>
<p>Database</p>
<ul>
<li>
<p>SCALE & HIGH AVAILABILITY</p>
<p>You can configure MySQL replicas. There are two configurations; choose one based on your needs.</p>
<ol>
<li>
<p><a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-mysql-master-master-replication">Master-Master</a></p>
<p>Choose this if you have more write (update, insert) than read (select) operations.</p>
</li>
<li>
<p><a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-in-mysql">Master-Slave</a></p>
<p>This is useful if you have more read (select) operations.</p>
</li>
</ol>
<p>(Setting up these configurations is out of the scope of this article.)</p>
</li>
<li>
<p>Backup & Restore</p>
<p>You can use <a href="http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html"><code class="language-plaintext highlighter-rouge">mysqldump</code></a> or, if the tables are MyISAM, <a href="http://dev.mysql.com/doc/refman/5.1/en/mysqlhotcopy.html"><code class="language-plaintext highlighter-rouge">mysqlhotcopy</code></a>.
There is an excellent article on backup and restore: <a href="https://www.digitalocean.com/community/tutorials/how-to-backup-mysql-databases-on-an-ubuntu-vps">How To Backup MySQL Databases on an Ubuntu VPS</a>. I urge you to read it.</p>
</li>
<li>
<p>Logs management</p>
<p>Logs are created under the <code class="language-plaintext highlighter-rouge">/var/log</code> directory. You can use syslog to ship them to a centralized location; better yet, if you have an <code class="language-plaintext highlighter-rouge">ELK</code> stack set up, ship the logs into Elasticsearch and build a nice visualization for the team to look at.</p>
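<p>For instance, a single rsyslog rule is enough to forward everything to a central host (the hostname and port here are placeholders):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/rsyslog.d/50-forward.conf
# Forward all logs over UDP to the central syslog/ELK host
*.* @logs.example.com:514
</code></pre></div></div>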
</li>
</ul>
</li>
<li>
<p>Application</p>
<ul>
<li>
<p>SCALE</p>
<p>Now you can launch RoR containers, each running on a different port, e.g. 49172, 49173, etc. If traffic increases, you will want to scale the infrastructure by launching more identical instances. Here the loadbalancer comes into the picture.</p>
<p>On launching a new container, just add an entry for it to the loadbalancer configuration; when a container goes down, take the entry out. There is a tool that can help with this chore: <a href="https://github.com/sahilsk/BackendsUpdater">BackendUpdater</a>. Basically, it listens to Docker events and updates your loadbalancer configuration file accordingly, followed by a configuration reload (<code class="language-plaintext highlighter-rouge">nginx reload</code>). More about it in the next post.</p>
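<p>The idea behind such a tool can be sketched in a few lines of shell. The <code class="language-plaintext highlighter-rouge">generate_upstreams.sh</code> helper and the file paths are hypothetical, and <code class="language-plaintext highlighter-rouge">docker events --filter</code> requires a reasonably recent Docker:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Watch container start/stop events and refresh the nginx upstream list
docker events --filter 'event=start' --filter 'event=die' | while read event; do
    # Regenerate the upstream block from the currently running containers
    ./generate_upstreams.sh &gt; /etc/nginx/conf.d/upstreams.conf
    nginx -s reload
done
</code></pre></div></div>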
</li>
<li>
<p>High Availability</p>
<p>For maintenance or an upgrade, you can take containers out, do the work, and bring them back when done. Upgrading them one by one also gives you canary testing.</p>
<p>This likewise enables zero-downtime deployment of your upgrades.</p>
</li>
</ul>
</li>
</ul>
<h3 id="why-do-we-need-a-loadbalancer">Why do we need a loadbalancer?</h3>
<p><strong>Loadbalancer</strong>: Loadbalancers are one or more servers that forward traffic to our backend application servers.<br />
A loadbalancer is required to distribute load among servers. Load distribution is needed when heavy traffic hits your service and a single server cannot withstand it.<br />
If you’re running resource-intensive (CPU, RAM, etc.) jobs, like image processing or rendering, you also need a way to spread load onto idle or lightly utilized servers.</p>
<p>A loadbalancer is used not only to distribute load but also to ease deployment with minimal downtime. Just take one server out of the pool and do your maintenance or upgrade there. When you are done, put it back online. After your canary testing you can repeat the same for the other servers.</p>
<p>So, for our RoR application we’re assuming it becomes incredibly famous, bringing thousands of hits per second. To accommodate that many requests, we’ll put in loadbalancers.</p>
<figure class="highlight"><pre><code class="language-plaintext" data-lang="plaintext">             +--------------+        +-----------------+
  T T T      |              |        |  +-----------+  |
  R R R ---&gt; |     LOAD     | -----&gt; |  | myApp|01  |  |       +----------+
  A A A      |   BALANCER   |        |  +-----------+  | ----&gt; |          |
  F F F &lt;--- |              | &lt;----- |  | myApp|02  |  |       | DATABASE |
  F F F      |              |        |  +-----------+  | &lt;---- |          |
  I I I      |              |        |  | myApp|N   |  |       +----------+
  C C C      |              |        |  +-----------+  |
             +--------------+        +-----------------+</code></pre></figure>
<p><strong>loadbalancer</strong></p>
<figure class="highlight"><pre><code class="language-nginx" data-lang="nginx"># Set your server
# server_name www.example.com;

upstream containers {
    # Add a list of your application servers
    # Each server defined on its own line
    # Example:
    # server IP.ADDR:PORT fail_timeout=0;
    server 127.0.0.1:49172 fail_timeout=0;
    server 127.0.0.1:49173 fail_timeout=0;
}

server {
    # Port to listen on
    listen 80;

    location / {
        # Set proxy headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        proxy_pass http://containers;

        # Turn on nginx stats
        stub_status on;
    }
}</code></pre></figure>
<ul>
<li>
<ul>
<li>
<p>Backup ( No restoring here) & Logs management</p>
<p>The application is stateless, with all information stored in the database. Still, we sometimes need backups, for logs or other items. Here <code class="language-plaintext highlighter-rouge">volumes</code> come into the picture.</p>
<p>In the Dockerfile, we’d declared mountable directories.</p>
</li>
</ul>
</li>
</ul>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Define mountable directories.</span>
VOLUME <span class="o">[</span><span class="s2">"/etc/dailyReport"</span>, <span class="s2">"/var/log/dailyReport"</span>, <span class="s2">"/etc/nginx/sites-enabled"</span>, <span class="s2">"/etc/nginx/certs"</span>, <span class="s2">"/etc/nginx/conf.d"</span>, <span class="s2">"/var/log/nginx"</span><span class="o">]</span> </code></pre></figure>
<p>Let’s make use of them.</p>
<h3 id="volumes">Volumes</h3>
<p><strong>Volumes</strong> let us create mountable directories inside containers; see <a href="https://docs.docker.com/userguide/dockervolumes/">docker volumes</a>. They are helpful in a number of cases, one of which is <strong>shipping logs</strong>.<br />
Let’s have one container do the job of shipping logs: separation of concerns, with one container doing one job and doing it well.<br />
We’ll create a new container, called a <code class="language-plaintext highlighter-rouge">data container</code>, which inherits volumes from our app container.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$sudo docker run -d --volumes-from app-001.example.com --name shiplogs.example.com myDockerfile/shiplogs
</code></pre></div></div>
<p>Inside this container, <code class="language-plaintext highlighter-rouge">/var/log/dailyReport</code> is accessible, where the app container is writing its logs. In our <code class="language-plaintext highlighter-rouge">shiplogs</code> container we can run a process that ships logs from there to a centralized repository. (What that process could be is left for a future post.)</p>
<p>Similarly, you can access the reverse proxy logs by reading <code class="language-plaintext highlighter-rouge">/var/log/nginx</code>. They’re useful when debugging strange behavior in your application.</p>
<p>Another mounted directory is <code class="language-plaintext highlighter-rouge">/etc/dailyReport</code>, which stores all the configuration files.
Why? If you want to edit run.sh, unicorn.rb, or the reverse proxy configuration, you can do so without rebuilding the image.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">$ docker run -it --rm -v /var/log/dailyReport:/var/log/dailyReport -v /etc/dailyReport:/etc/dailyReport -v /var/run/mysqld:/var/run/mysqld:ro -p 49173:80 -e "RAILS_ENV=production" --name app-001.example.com myDockerfiles/dailyreport /bin/bash</code></pre></figure>
<p>Here i’ve mounted <code class="language-plaintext highlighter-rouge">/var/log/dailyReport</code> and <code class="language-plaintext highlighter-rouge">/etc/dailyReport</code> directory of host onto container. This will make the container to use my configuration files stored at <code class="language-plaintext highlighter-rouge">/etc/dailyReport</code>. Also, i can see the logs created on host directory for debugging purpose.</p>
<h2 id="conclusion">Conclusion</h2>
<p>To recap: we’ve containerized the RoR application and are running one or more instances of it behind a loadbalancer. I’ve also tried my best to answer the scale, availability, and fault-tolerance questions for each component. With this, I close the article.<br />
Comments are most welcome. If you have any queries or better suggestions, write them down in the comments.</p>
<p>Later on, I’ll pen down a centralized logging solution with the ELK stack and my experience while working on it.
So, stay tuned. ;)</p>
<p><a href="http://sahilsk.github.io/articles/rubyonrails-app-on-docker-part-iii-answering-failures/">RubyOnRails App On Docker: Part-III Making it Robust, final conclusion</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on October 07, 2014.</p>http://sahilsk.github.io/articles/rubyonrails-app-on-docker-part-ii-containerize-app2014-10-01T09:42:48+00:002014-10-01T09:42:48+00:00Sonu K. Meenahttp://sahilsk.github.iosonukr.meena@gmail.com<h1 id="rubyonrails-app-on-docker-part-ii-how-are-we-doing">RubyOnRails App On Docker: Part-II How are we doing?</h1>
<p>Assumption:</p>
<p>You’re free to name your application and Docker namespace anything you want. However, to make this article more readable, I’m using the names below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>application name: `dailyReport`
docker user namespace: `myDockerfiles`
</code></pre></div></div>
<p>Index</p>
<ul>
<li>Setup and Install database</li>
<li>Containerize RoR Apps</li>
<li>Setup Reverse Proxy using Nginx</li>
</ul>
<ol>
<li>
<h2 id="setup-and-install-database">Setup and Install database</h2>
</li>
</ol>
<p>For this application we’ll use MySQL. There are two ways to run MySQL</p>
<ul>
<li>Run without container</li>
<li>Run inside Docker</li>
</ul>
<h2 id="q-any-performance-impact-when-runng-inside-container">Q. Any performance impact when running inside a container?</h2>
<p>In Docker, CPU performance is native and disk latency is native; memory and network latency are not quite native, but can be made close to it.</p>
<p>Nowadays hardware is cheap but software is costly, so we needn’t worry about the little memory Docker sets aside. If you really want to squeeze out every single drop, there are ways to do so, and network latency can also be made nearly as fast as native. This minor overhead we can bear.</p>
<p>So, having accepted the small memory and network-latency overhead, we proceed to Dockerize the MySQL instance.</p>
<p>Luckily, there is already a MySQL Dockerfile ready on Docker Hub: <a href="https://github.com/dockerfile/mysql">MySQL Dockerfile</a></p>
<p><strong>dockerfile/mysql</strong></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#</span>
<span class="c"># MySQL Dockerfile</span>
<span class="c">#</span>
<span class="c"># https://github.com/dockerfile/mysql</span>
<span class="c">#</span>
<span class="c"># Pull base image.</span>
FROM dockerfile/ubuntu
<span class="c"># Install MySQL.</span>
RUN <span class="se">\</span>
apt-get update <span class="o">&&</span> <span class="se">\</span>
<span class="nv">DEBIAN_FRONTEND</span><span class="o">=</span>noninteractive apt-get <span class="nb">install</span> <span class="nt">-y</span> mysql-server <span class="o">&&</span> <span class="se">\</span>
<span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span> <span class="o">&&</span> <span class="se">\</span>
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'s/^\(bind-address\s.*\)/# \1/'</span> /etc/mysql/my.cnf <span class="o">&&</span> <span class="se">\</span>
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'s/^\(log_error\s.*\)/# \1/'</span> /etc/mysql/my.cnf <span class="o">&&</span> <span class="se">\</span>
<span class="nb">echo</span> <span class="s2">"mysqld_safe &"</span> <span class="o">></span> /tmp/config <span class="o">&&</span> <span class="se">\</span>
<span class="nb">echo</span> <span class="s2">"mysqladmin --silent --wait=30 ping || exit 1"</span> <span class="o">>></span> /tmp/config <span class="o">&&</span> <span class="se">\</span>
<span class="nb">echo</span> <span class="s2">"mysql -e 'GRANT ALL PRIVILEGES ON *.* TO </span><span class="se">\"</span><span class="s2">root</span><span class="se">\"</span><span class="s2">@</span><span class="se">\"</span><span class="s2">%</span><span class="se">\"</span><span class="s2"> WITH GRANT OPTION;'"</span> <span class="o">>></span> /tmp/config <span class="o">&&</span> <span class="se">\</span>
bash /tmp/config <span class="o">&&</span> <span class="se">\</span>
<span class="nb">rm</span> <span class="nt">-f</span> /tmp/config
<span class="c"># Define mountable directories.</span>
VOLUME <span class="o">[</span><span class="s2">"/etc/mysql"</span>, <span class="s2">"/var/lib/mysql"</span><span class="o">]</span>
<span class="c"># Define working directory.</span>
WORKDIR /data
<span class="c"># Define default command.</span>
CMD <span class="o">[</span><span class="s2">"mysqld_safe"</span><span class="o">]</span>
<span class="c"># Expose ports.</span>
EXPOSE 3306
</code></pre></div></div>
<p>The Dockerfile is quite simple, isn’t it? In the first few lines, we install the MySQL server and grant the root user all privileges.</p>
<p>Line 24, however, needs a little elaboration. The data directory will enable direct access to configuration and data files; I’ll come back to this in Part III. For now, let’s get the parts rolling.</p>
<p>Let’s setup and run mysql</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#Pull and Run mysql image</span>
<span class="nb">sudo </span>docker run <span class="nt">-d</span> <span class="nt">--name</span> mysql <span class="nt">-p</span> 3306:3306 dockerfile/mysql
</code></pre></div></div>
<p>This one line suffices to get the MySQL server up and running.
To verify, we’ll start a MySQL client using the same image but a different command.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">sudo </span>docker run <span class="nt">-it</span> <span class="nt">--rm</span> <span class="nt">--link</span> mysql:mysql dockerfile/mysql bash <span class="nt">-c</span> <span class="s1">'mysql -h $MYSQL_PORT_3306_TCP_ADDR'</span>
</code></pre></div></div>
<h3 id="how-to-connect-our-application-with-database">How to connect our application with database?</h3>
<p>There are actually two ways to point our app at the MySQL connection.</p>
<ul>
<li>
<p>Mounting the default MySQL unix socket</p>
<p>This is useful if you don’t want to expose MySQL publicly, and it is what I’ve done for this tutorial.
I had the database installed on my server, so I mount <code class="language-plaintext highlighter-rouge">/var/run/mysqld</code> into the container, enabling Rails to find the default MySQL endpoint to connect to.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> docker run -d -p 49172:80 -v /var/run/mysqld:/var/run/mysqld:ro --restart="always" -e "RAILS_ENV=production" myDockerfiles/dailyreport
</code></pre></div> </div>
</li>
<li>
<p>Specify connection string</p>
<p>If you want to use a database running elsewhere, accessible through an IP and port, you can specify the connection string in the <code class="language-plaintext highlighter-rouge">DATABASE_URL</code> environment variable.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> docker run -d -p 49172:80 --restart="always" -e "RAILS_ENV=production" -e "DATABASE_URL=mysql2://username:password@IP/DB_NAME" myDockerfiles/dailyreport
</code></pre></div> </div>
</li>
</ul>
<ol>
<li>
<h2 id="containerize-ror-apps">Containerize RoR Apps</h2>
</li>
</ol>
<p>The RoR framework already ships with sensible defaults and software-development best practices.
However, there are a few configurations I’d like to stress:</p>
<ul>
<li>
<p>Session Storage</p>
<p>Store session information in the database. This makes our app behave more like a stateless app, which is essential if we want to scale the infrastructure further.</p>
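<p>With the <code class="language-plaintext highlighter-rouge">activerecord-session_store</code> gem, for example, this is a one-line initializer (the gem and session key name are just one option):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># config/initializers/session_store.rb
Rails.application.config.session_store :active_record_store, key: '_dailyReport_session'
</code></pre></div></div>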
</li>
<li>
<p>Secrets</p>
<p>The database configuration, the RoR secret key (SECRET_KEY_BASE), the environment, SMTP credentials, and any other third-party add-on secrets your app might use should not be hardcoded in configuration files. Instead, they should be picked up from the environment.</p>
</li>
</ul>
<p>Here’s one example showing database credentials being picked from environment.</p>
<p><strong>config/database.yml</strong></p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">production</span><span class="pi">:</span>
<span class="s"><<</span><span class="pi">:</span> <span class="nv">*default</span>
<span class="na">database</span><span class="pi">:</span> <span class="s"><%= ENV['DATABASE_PROD'] %></span>
<span class="na">username</span><span class="pi">:</span> <span class="s"><%= ENV['DATABASE_USERNAME'] %></span>
<span class="na">password</span><span class="pi">:</span> <span class="s"><%= ENV['DATABASE_PASSWORD'] %></span>
</code></pre></div></div>
<p>Similarly, we’ll specify the SMTP parameters. If you’re using any third-party service (mailgun, AWS, etc.), credentials should not be hardcoded; rather, they should be set in the environment for the process to pick up at run time.</p>
<p>For setting database credentials in the environment, RoR provides a shorthand:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> DATABASE_URL="mysql2://myuser:mypass@localhost/somedatabase"
</code></pre></div></div>
<p>NOTE: Setting the <code class="language-plaintext highlighter-rouge">DATABASE_URL</code> environment variable takes precedence over the config file parameters; it is merged with the config file to populate the DB connection settings.</p>
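<p>In practice this means the file can keep only generic settings while the rest is injected at run time, e.g.:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># database.yml keeps adapter/pool defaults; the URL supplies host and credentials
export DATABASE_URL="mysql2://myuser:mypass@localhost/somedatabase"
bundle exec rake db:migrate RAILS_ENV=production
</code></pre></div></div>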
<h3 id="app-server-for-ror-unicorn">App Server for RoR: Unicorn</h3>
<p>We’ll choose the widely adopted Unicorn as our application server.</p>
<p><strong>unicorn.rb</strong></p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Set the working application directory</span>
<span class="c1"># working_directory "/path/to/your/app"</span>
<span class="n">working_directory</span> <span class="s2">"/opt/dailyReport"</span>
<span class="c1"># Unicorn PID file location</span>
<span class="n">pid</span> <span class="s2">"/var/run/unicorn.pid"</span>
<span class="c1"># Path to logs</span>
<span class="n">stderr_path</span> <span class="s2">"/var/log/dailyReport/unicorn.err.log"</span>
<span class="n">stdout_path</span> <span class="s2">"/var/log/dailyReport/unicorn.log"</span>
<span class="c1"># Unicorn socket</span>
<span class="n">listen</span> <span class="s2">"/tmp/unicorn.dailyReport.sock"</span>
<span class="c1"># Number of processes</span>
<span class="c1">## Rule of thumb: 2x per core</span>
<span class="n">worker_processes</span> <span class="mi">2</span>
<span class="c1"># Time-out</span>
<span class="n">timeout</span> <span class="mi">30</span>
</code></pre></div></div>
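<p>Instead of hardcoding the worker count, the 2x-per-core rule can be derived at boot. Note this is only a sketch, and <code class="language-plaintext highlighter-rouge">Etc.nprocessors</code> requires Ruby 2.2+:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'etc'

# Two workers per core, per the rule of thumb above
worker_processes Etc.nprocessors * 2
</code></pre></div></div>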
<h3 id="web-server-for-ror-nginx">Web Server for RoR: Nginx</h3>
<p>As the Rails guides say, the best practice is to serve assets through nginx. Here nginx will also serve as a reverse proxy, masking the unix socket and giving the illusion of the app running on an HTTP port.</p>
<p>To make asset serving faster, we’ll gzip our stylesheets and javascripts. How? That is out of the scope of this article; in Rails, though, a simple <code class="language-plaintext highlighter-rouge">rake assets:precompile</code> command does the trick.
Our web server will have an assets block that serves these compressed files.</p>
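<p>The precompile step itself is a single rake task, typically run before building the image:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RAILS_ENV=production bundle exec rake assets:precompile
</code></pre></div></div>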
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>upstream app <span class="o">{</span>
<span class="c"># Path to Unicorn SOCK file, as defined previously</span>
server unix:/tmp/unicorn.dailyReport.sock <span class="nv">fail_timeout</span><span class="o">=</span>0<span class="p">;</span>
<span class="o">}</span>
server <span class="o">{</span>
listen 80<span class="p">;</span>
server_name localhost<span class="p">;</span>
<span class="c"># Application root, as defined previously</span>
root /opt/dailyReport/public<span class="p">;</span>
try_files <span class="nv">$uri</span>/index.html <span class="nv">$uri</span> @app<span class="p">;</span>
location @app <span class="o">{</span>
proxy_set_header X-Forwarded-For <span class="nv">$proxy_add_x_forwarded_for</span><span class="p">;</span>
proxy_set_header Host <span class="nv">$http_host</span><span class="p">;</span>
proxy_redirect off<span class="p">;</span>
proxy_pass http://app<span class="p">;</span> <span class="c"># point to our upstream server list</span>
<span class="o">}</span>
<span class="c">#Server compressed assets</span>
location ~ ^/<span class="o">(</span>assets<span class="o">)</span>/ <span class="o">{</span>
gzip_static on<span class="p">;</span> <span class="c"># to serve pre-gzipped version</span>
expires max<span class="p">;</span>
add_header Cache-Control public<span class="p">;</span>
<span class="o">}</span>
error_page 500 502 503 504 /500.html<span class="p">;</span>
client_max_body_size 4G<span class="p">;</span>
keepalive_timeout 10<span class="p">;</span>
<span class="o">}</span>
</code></pre></div></div>
<h4 id="how-will-we-start-our-app">How will we start our app?</h4>
<p>Here run.sh comes into the picture: it holds our startup script.</p>
<p><strong>run.sh</strong></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash

RAILS_ENV=$RAILS_ENV
: ${RAILS_ENV:="development"}
export RAILS_ENV

SECRET_KEY_BASE=$SECRET_KEY_BASE
: ${SECRET_KEY_BASE:="f38c575fcf0a2b0e7c7f002a873d54d78104581ebe069bf2b1afad04014d1e10245b259b872b0e12189ef2ce3fca4c73a9b5103aaf4aad1f4"}
export SECRET_KEY_BASE=$SECRET_KEY_BASE

## Setting DB
DB_NAME="dailyReport_${RAILS_ENV}"
#DATABASE_URL="mysql2://root:root@localhost/${DB_NAME}"

# Trap SIGINT/SIGTERM: otherwise docker stop/start will complain about a stale unicorn pid
# (SIGKILL cannot be trapped, so it is not listed)
trap "pkill unicorn_rails ; exit" SIGINT SIGTERM

echo "Stopping unicorn_rails, if already running"
pkill unicorn_rails

echo "Cleaning tmp files"
rm -rf tmp/*

echo "Restarting reverse proxy"
service nginx restart

echo "Running unicorn"
bundle exec unicorn_rails -c /etc/dailyReport/unicorn.rb -E $RAILS_ENV -d
</code></pre></div></div>
<p>Let’s wrap these lines inside a <code class="language-plaintext highlighter-rouge">run.sh</code> file, which will serve as our app startup script.</p>
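<p>One caveat on the <code class="language-plaintext highlighter-rouge">trap</code> line: SIGKILL can never actually be caught by a process, so only SIGINT and SIGTERM matter here (<code class="language-plaintext highlighter-rouge">docker stop</code> sends SIGTERM first). A small, self-contained sketch of how a trapped SIGTERM turns into a clean exit:</p>

```shell
# A child process installs a TERM trap, so `kill -TERM` produces a clean
# exit (status 0) instead of the default death-by-signal.
sh -c 'trap "echo caught TERM; exit 0" TERM; while :; do sleep 1; done' &
child=$!
sleep 1             # give the child time to install its trap
kill -TERM "$child"
wait "$child"
echo "exit status: $?"   # -> exit status: 0
```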
<p>We wrote <code class="language-plaintext highlighter-rouge">run.sh</code> and <code class="language-plaintext highlighter-rouge">unicorn.rb</code>. We’ve also replaced hardcoded database, SMTP, and third-party credentials with environment variables.
Now we need to wrap our app and unicorn in a container.</p>
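<p>The <code class="language-plaintext highlighter-rouge">: ${VAR:=default}</code> idiom used in <code class="language-plaintext highlighter-rouge">run.sh</code> is what makes this work: it assigns the default only when the variable is unset, so a value injected at container start (e.g. via <code class="language-plaintext highlighter-rouge">docker run -e</code>) always wins. A small illustration (the variable names and values here are arbitrary):</p>

```shell
#!/bin/sh
# No value provided: the fallback is assigned.
: "${SECRET_KEY_BASE:=fallback-key}"
echo "$SECRET_KEY_BASE"    # -> fallback-key

# A value already in the environment takes precedence over the fallback.
DB_PASSWORD="from-docker-run-e"
: "${DB_PASSWORD:=fallback-password}"
echo "$DB_PASSWORD"        # -> from-docker-run-e
```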
<p>For this, we’ll create two Dockerfiles:</p>
<ul>
<li>
<p>Base Dockerfile</p>
<p>It’ll contain the latest versions of Ruby and Rails.
Ruby 1.9.x is nearing its EOL, and the newer 2.1.x series is comparatively fast and stable.
So we’ll use Ruby 2.1.2 and install it via rbenv. rbenv will also let us update the Ruby version without re-building the Docker image from scratch. <strong>How?</strong></p>
</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Start a shell in the base image and note down its container id</span>
docker run <span class="nt">-it</span> baseDockerImage /bin/bash
<span class="nv">$ </span>rbenv <span class="nb">install </span>2.1.3
<span class="nv">$ </span><span class="nb">exit
</span><span class="c"># Commit the running container as a new tagged image</span>
docker commit <span class="nt">-m</span> <span class="s2">"ruby2.1.3"</span> CONTAINER_ID baseDockerImage:2.1.3
</code></pre></div></div>
<ul>
<li>
<p>Main Dockerfile</p>
<p>This will be the Dockerfile for our application. It’ll include ‘unicorn.rb’, ‘run.sh’ and the ‘reverse proxy’ configuration. Basically, everything that’s required to run the RoR app natively.</p>
</li>
</ul>
<hr />
<p><strong>myDockerfiles/base_ruby</strong></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#
# Ruby with rbenv Dockerfile
#
# Pull base image.
FROM dockerfile/ubuntu
# Install some dependencies
RUN apt-get update
RUN apt-get install -y git-core curl zlib1g-dev build-essential libssl-dev libreadline-dev libyaml-dev libsqlite3-dev sqlite3 libxml2-dev libxslt1-dev libcurl4-openssl-dev python-software-properties
# Install rbenv to install ruby
RUN git clone git://github.com/sstephenson/rbenv.git /usr/local/rbenv
RUN echo '# rbenv setup' > /etc/profile.d/rbenv.sh
RUN echo 'export RBENV_ROOT=/usr/local/rbenv' >> /etc/profile.d/rbenv.sh
RUN echo 'export PATH="$RBENV_ROOT/bin:$PATH"' >> /etc/profile.d/rbenv.sh
RUN echo 'eval "$(rbenv init -)"' >> /etc/profile.d/rbenv.sh
RUN chmod +x /etc/profile.d/rbenv.sh
# Install rbenv plugin: ruby-build
RUN mkdir /usr/local/rbenv/plugins
RUN git clone https://github.com/sstephenson/ruby-build.git /usr/local/rbenv/plugins/ruby-build
# Let's not copy gem package documentation
RUN echo "gem: --no-ri --no-rdoc" > ~/.gemrc
ENV RBENV_ROOT /usr/local/rbenv
ENV PATH $RBENV_ROOT/bin:$RBENV_ROOT/shims:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Install ruby
RUN rbenv install 2.1.2
RUN rbenv local 2.1.2
RUN rbenv global 2.1.2
## Install Rails
RUN apt-get install -y software-properties-common
RUN add-apt-repository ppa:chris-lea/node.js
RUN apt-get update
RUN apt-get install -y nodejs
## Finally, install Rails
RUN gem install rails
RUN rbenv rehash
CMD /bin/bash
</code></pre></div></div>
<p>Let’s build and tag it</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-t</span> <span class="s2">"myDockerfiles/base_ruby"</span> <span class="nb">.</span>
<span class="c"># Run and test</span>
docker run <span class="nt">-it</span> <span class="nt">--rm</span> myDockerfiles/base_ruby /bin/bash <span class="nt">-c</span> <span class="s1">'ruby -v'</span>
</code></pre></div></div>
<p>Here comes the main app Dockerfile that we’ll use.</p>
<p><strong>myDockerfiles/main</strong></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#</span>
<span class="c"># dailyReport in Container</span>
<span class="c">#</span>
<span class="c"># Pull base image.</span>
FROM myDockerfiles/base_ruby
<span class="c"># Fill dependencies for mysql2 gem</span>
RUN apt-get <span class="nb">install</span> <span class="nt">-y</span> libmysqlclient-dev libmysqlclient18 ruby-dev
<span class="c"># Install Nginx.</span>
RUN <span class="se">\</span>
add-apt-repository <span class="nt">-y</span> ppa:nginx/stable <span class="o">&&</span> <span class="se">\</span>
apt-get update <span class="o">&&</span> <span class="se">\</span>
apt-get <span class="nb">install</span> <span class="nt">-y</span> nginx <span class="o">&&</span> <span class="se">\</span>
<span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span> <span class="o">&&</span> <span class="se">\</span>
<span class="nb">chown</span> <span class="nt">-R</span> www-data:www-data /var/lib/nginx
<span class="c"># Pull repository from private github repos</span>
<span class="c">### Create .ssh dir in home directory</span>
RUN <span class="nb">mkdir</span> <span class="nt">-p</span> /root/.ssh
<span class="c"># Add your private key here. (Create a separate key, so that you can revoke it later)</span>
ADD ./id_rsa /root/.ssh/id_rsa
RUN <span class="nb">chmod </span>700 /root/.ssh/id_rsa
RUN <span class="nb">echo</span> <span class="s2">"Host github.com</span><span class="se">\n\t</span><span class="s2">StrictHostKeyChecking no</span><span class="se">\n</span><span class="s2">"</span> <span class="o">>></span> /root/.ssh/config
<span class="c"># Setup Reverse Proxy : Add reverse proxy config here</span>
ADD ./dailyReport_nginx.conf /etc/nginx/sites-enabled/default
WORKDIR /opt/dailyReport
<span class="c"># Pull project : Replace with your github handle and repository</span>
RUN git clone git@github.com:sahilsk/dailyReport.git <span class="nb">.</span>
<span class="c"># Install gem</span>
RUN gem <span class="nb">install </span>bundler
RUN bundle <span class="nb">install
</span>RUN rbenv rehash
<span class="c"># Pre-compile app production assets</span>
RUN <span class="nv">RAILS_ENV</span><span class="o">=</span>production bundle <span class="nb">exec </span>rake assets:precompile
<span class="c"># Add unicorn config here </span>
ADD ./unicorn.rb /etc/dailyReport/unicorn.rb
<span class="c"># Run script</span>
ADD ./run.sh /etc/dailyReport/run.sh
<span class="c"># Define mountable directories.</span>
VOLUME <span class="o">[</span><span class="s2">"/etc/dailyReport"</span>, <span class="s2">"/var/log/dailyReport"</span>, <span class="s2">"/etc/nginx/sites-enabled"</span>, <span class="s2">"/etc/nginx/certs"</span>, <span class="s2">"/etc/nginx/conf.d"</span>, <span class="s2">"/var/log/nginx"</span><span class="o">]</span>
<span class="c"># Expose port 80</span>
EXPOSE 80
<span class="c"># Set environment variables</span>
ENV RAILS_ENV development
<span class="c">#</span>
CMD /bin/bash /etc/dailyReport/run.sh
</code></pre></div></div>
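<p>A note on the <code class="language-plaintext highlighter-rouge">ADD ./id_rsa</code> step above: since the key ends up baked into an image layer, it’s worth generating a dedicated, passphrase-less deploy key that can be revoked independently of your personal key, as the Dockerfile comment suggests. A possible sketch (file names and the key comment are arbitrary):</p>

```shell
# Generate a dedicated deploy key next to the Dockerfile
ssh-keygen -t rsa -b 4096 -N "" -f ./id_rsa -C "dailyReport-docker-deploy"

# Register ./id_rsa.pub as a read-only deploy key on the GitHub repository;
# revoke it there when the image is retired, without touching any other keys.
cat ./id_rsa.pub
```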
<p>Let’s build our main app now.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ docker build -t myDockerfiles/dailyreport .
</code></pre></div></div>
<p>If all goes well, you now have two images built successfully.</p>
<p>Having built both images, you can list them using docker commands:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ docker images
</code></pre></div></div>
<p>Now let’s run our app container:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ docker run -d -p 49172:80 -v /var/run/mysqld:/var/run/mysqld:ro --restart="always" -e "RAILS_ENV=production" myDockerfiles/dailyreport
</code></pre></div></div>
<p>You can visit <em>localhost:49172</em> and confirm that your app is up.</p>
<p>You can run more than one instance. Simply change the host port and execute:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ docker run -d -p 49173:80 -v /var/run/mysqld:/var/run/mysqld:ro --restart="always" -e "RAILS_ENV=production" myDockerfiles/dailyreport
</code></pre></div></div>
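<p>With two containers listening on <em>49172</em> and <em>49173</em>, a host-level nginx can balance traffic between them. A minimal sketch of such an upstream block (the upstream name and <code class="language-plaintext highlighter-rouge">server_name</code> below are placeholders, not from the original setup):</p>

```nginx
# Round-robin across the two published container ports
upstream dailyreport_cluster {
    server 127.0.0.1:49172;
    server 127.0.0.1:49173;
}

server {
    listen 80;
    server_name dailyreport.example.com;

    location / {
        proxy_pass http://dailyreport_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```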
<h1 id="references">References:</h1>
<ul>
<li>Dockerfile and scripts used here are on <a href="https://github.com/sahilsk/RoR-Dockerized">github</a></li>
</ul>
<p><a href="http://sahilsk.github.io/articles/rubyonrails-app-on-docker-part-ii-containerize-app/">RubyOnRails App on Docker: Part-II Containerizing App</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on October 01, 2014.</p>
<h1 id="rubyonrails-app-on-docker-part-i-what-are-we-doing">RubyOnRails App On Docker: Part-I What are we doing?</h1>
<p>In this post, I’ll try to pen down the steps to deploy a RubyOnRails app using Docker.
Before I begin, I assume readers have a basic understanding of Docker and Dockerfiles. Although I’m trying to keep this post generic and independent of the application-layer technology, a little RoR knowledge will help you get the best out of this article.</p>
<p>Docker is all the rage nowadays. Though Linux containers have existed for many years, their real potential was popularized by people like Jérôme Petazzoni. Being a nascent technology doesn’t stop CTOs from rolling it out in production.
<a href="http://runnable.com" title="runnable.com">Runnable</a>, <a href="http://newrelic.com" title="newrelic.com">NewRelic</a>, <a href="http://shippable.com" title="shippable.com">Shippable</a>, etc., are all living on the edge and using Docker in their day-to-day production as well as development work.</p>
<h3 id="what-influence-cto-decision">What Influences a CTO’s Decision?</h3>
<p>As a CTO, you need to take a wide perspective before adopting any new technology: scalability, high availability, downtime, the skills available in your team, time to learn, and so on. To put it in simple terms, it’s not easy for a new technology to come into the limelight. Technologies like HAProxy and ZooKeeper are battle-tested and proven ones. The same was not yet true for Docker.</p>
<p>However, wide early adoption of Docker by many lean Silicon Valley startups has challenged this. Their stories have inspired many others. You can read one from NewRelic <a href="http://blog.newrelic.com/2014/08/12/docker-centurion/">here</a>.</p>
<p>Recently, Amazon Web Services has started offering Docker container support.
The recent Docker <a href="https://blog.docker.com/2014/08/announcing-docker-1-2-0/">v1.2.0</a> release comes with enhanced security features that further support Docker’s acceptance in production.
Red Hat’s collaboration with Docker further emphasizes that <strong>Docker is ready for production</strong>.</p>
<h3 id="coming-to-the-post-let-me-paraphrase-the-title-of-this-article">Coming to the post, let me paraphrase the title of this article:</h3>
<blockquote>
<p>Deploying RoR app using Docker</p>
</blockquote>
<p>Let me break down the word “Deployment” as per the DevOps dictionary:</p>
<ul>
<li>Setup and configure Unicorn</li>
<li>Setup and configure Reverse Proxy: Nginx</li>
<li>Setup Database : MySQL</li>
<li>Pre-compile Assets and configure db settings</li>
</ul>
<p>Wait, there’s more. How will you <code class="language-plaintext highlighter-rouge">scale</code> your application? How will you ensure <code class="language-plaintext highlighter-rouge">High availability</code> (henceforth HA) of your service and commit to 99.999% uptime for your customers?</p>
<p>Let’s give them a short visit here:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">SCALABILITY</code></p>
<p>If your application is based on the <a href="http://12factor.net">12 factors</a> commandments, then it is likely to be scalable. If you don’t know these 12 factors, I strongly suggest you visit the <a href="http://12factor.net">site</a> and skim it in one go.</p>
<p>Expanding further, apps can be of two types:</p>
<ul>
<li>
<p>Stateless</p>
<p>Stateless applications are easy to scale up and down. If it’s a <a href="http://12factor.net">12factor</a> app, then scaling is as easy as spinning more instances up or down; the 12-factor commandments make scalability a breeze.</p>
</li>
<li>
<p>Stateful</p>
<p>Stateful applications, like databases, are comparatively difficult to scale. However, some databases do provide clustering and sharding out of the box. You’ll want to consider this criterion while making the DBMS decision for your app.</p>
</li>
</ul>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">HIGH AVAILABILITY</code></p>
<p>With the advancement of technology, zero-downtime deployment is no longer a dream. If your infrastructure does not yet support it, then indeed you’re living in the stone age. Building a 12-factor app also enables zero-downtime deployment.</p>
</li>
</ul>
<h4 id="finally-some-sys-admin-chores">Finally, some sys-admin chores:</h4>
<ol>
<li>
<p>Backup & Restore</p>
<p>Back up database data and configuration files periodically and ship them to a safe place (a centralized server or S3 bucket).</p>
</li>
<li>
<p>Logs Management</p>
<p>Logs are no longer neglected in today’s age of big data. Management needs information to aid its decisions, and logs provide those inputs: geographic distribution, user browsing behaviour, buying trends, and more can all be gathered from logs.</p>
</li>
</ol>
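<p>As a concrete sketch of the backup chore, a nightly crontab entry on the Docker host could dump the database and ship it to S3. The bucket name, schedule, and paths below are placeholders, and <code class="language-plaintext highlighter-rouge">mysqldump</code> plus the AWS CLI are assumed to be installed:</p>

```text
# /etc/cron.d/dailyreport-backup (sketch; % must be escaped in crontabs)
# m h dom mon dow user command
30 2 * * * root mysqldump -u root dailyReport_production | gzip | aws s3 cp - s3://my-backup-bucket/mysql/dailyReport-$(date +\%F).sql.gz
```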
<hr />
<p>Now, having laid out the full requirements, we’ll visit them one by one in this series of articles, divided into three parts.</p>
<ol>
<li>
<p>Part-I : What are we doing?</p>
<p>Understanding the full deployment requirement.</p>
</li>
<li>
<p>Part-II : How are we doing?</p>
<p>This part will include:</p>
<ul>
<li>Setup and Installation of Database</li>
<li>Containerize RoR App with webserver</li>
<li>Reverse Proxy : Why and How?</li>
</ul>
</li>
<li>
<p>Part-III : Conclusion</p>
<p>In this last article I’ll revisit this post to answer our deployment requirements.</p>
</li>
</ol>
<p><a href="http://sahilsk.github.io/articles/rubyonrails-app-on-docker-part-i-understanding-specs/">RubyOnRails App on Docker: Part-I Understanding Specs</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on September 27, 2014.</p>
<p>You’ll find this post in your <code class="language-plaintext highlighter-rouge">_posts</code> directory - edit this post and re-build (or run with the <code class="language-plaintext highlighter-rouge">-w</code> switch) to see your changes!
To add new posts, simply add a file in the <code class="language-plaintext highlighter-rouge">_posts</code> directory that follows the convention: YYYY-MM-DD-name-of-post.ext.</p>
<h2 id="sample-heading">Sample Heading</h2>
<h3 id="sample-heading-2">Sample Heading 2</h3>
<p>Jekyll also offers powerful support for code snippets:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">def</span> <span class="nf">print_hi</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">"Hi, </span><span class="si">#{</span><span class="nb">name</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="n">print_hi</span><span class="p">(</span><span class="s1">'Tom'</span><span class="p">)</span>
<span class="c1">#=> prints 'Hi, Tom' to STDOUT.</span></code></pre></figure>
<p>Check out the <a href="http://jekyllrb.com">Jekyll docs</a> for more info on how to get the most out of Jekyll. File all bugs/feature requests at <a href="https://github.com/jekyll/jekyll">Jekyll’s GitHub repo</a>.</p>
<p><a href="http://sahilsk.github.io/articles/hello-world/">Hello World</a> was originally published by Sonu K. Meena at <a href="http://sahilsk.github.io">Full Stack Story</a> on September 03, 2014.</p>