Monday 14 October 2019

Terraform in 10 mins


Terraform is an "infrastructure-as-code" tool, and there are many tools in this space that can do a similar job:

  • Puppet
  • SaltStack
  • Chef
  • Ansible
  • CloudFormation

Terraform can manage existing and popular service providers as well as custom in-house solutions. Configuration files describe to Terraform the components needed to run a single application or your entire datacenter.

Setup:

Download Terraform from -> Terraform

Follow the steps given in the link to set up your machine -> Install


First Terraform Script:

Create a new file named example.tf; .tf is the extension for Terraform configuration files.

If you don't have an AWS account, create one now. I will be using resources which qualify under the AWS free-tier, meaning it will be free. 

If you already have an AWS account, you may be charged some amount of money, but it shouldn't be more than a few rupees at most.

Once you have created your new AWS account, go ahead and install the AWS CLI from -> Install AWS-CLI



Now that you have installed the AWS CLI, configure your AWS environment:
open a command prompt and enter the following command.


$ aws configure

AWS Access Key ID [None]: <your Access key ID>
AWS Secret Access Key [None]: <your secret access key>
Default region name [None]: <your region>
Default output format [None]:

This stores your credentials under the default profile in ~/.aws/credentials, which is the profile our Terraform configuration will use.



Use the code below to create your first EC2 instance:


EC2 instance:

provider "aws" {
  profile    = "default"
  region     = "ap-south-1"
}

resource "aws_instance" "my_terraform_instance" {
  ami           = "ami-01469ab6166957456"
  tags {Name="Rakesh"}
  instance_type = "t2.micro"
}

NOTE: You need to change the AMI ID as per your region.
You can find region and AMI details here -> AMI Locator
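
If you switch regions often, one option is to pull the AMI ID into a variable instead of hard-coding it. A minimal sketch (the variable name is arbitrary; the default keeps the ap-south-1 AMI used above). Add this to example.tf:

variable "ami_id" {
  description = "AMI ID to launch; change per region (see the AMI Locator)"
  default     = "ami-01469ab6166957456"
}

and point the resource at it:

  ami = "${var.ami_id}"

You can then override it without editing the file: terraform apply -var "ami_id=<your AMI ID>"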


Now we are ready to create our first EC2 instance via Terraform.

Run the command below from the directory containing the example.tf file:

E:\home\Cloud_project\stage>terraform init

You might encounter the error below:

Terraform initialized in an empty directory!

The directory has no Terraform configuration files. You may begin working
with Terraform immediately by creating Terraform configuration files.

To resolve this:
  • Make sure you are running the command from the folder where your .tf file is located.
  • Make sure your file name has the correct extension.
  • Make sure you have a working internet connection and are not on a VPN.


Once resolved, run the command again. This time terraform init initializes the working directory and downloads the AWS provider plugin:

E:\home\Cloud_project\stage>terraform init


Deploy EC2:

E:\home\Cloud_project\stage>terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + aws_instance.my_terraform_instance
      id:                           <computed>
      ami:                          "ami-01469ab6166957456"
      arn:                          <computed>
      associate_public_ip_address:  <computed>
      availability_zone:            <computed>
      cpu_core_count:               <computed>
      cpu_threads_per_core:         <computed>
      ebs_block_device.#:           <computed>
      ephemeral_block_device.#:     <computed>
      get_password_data:            "false"
      host_id:                      <computed>
      instance_state:               <computed>
      instance_type:                "t2.micro"
      ipv6_address_count:           <computed>
      ipv6_addresses.#:             <computed>
      key_name:                     <computed>
      network_interface.#:          <computed>
      network_interface_id:         <computed>
      password_data:                <computed>
      placement_group:              <computed>
      primary_network_interface_id: <computed>
      private_dns:                  <computed>
      private_ip:                   <computed>
      public_dns:                   <computed>
      public_ip:                    <computed>
      root_block_device.#:          <computed>
      security_groups.#:            <computed>
      source_dest_check:            "true"
      subnet_id:                    <computed>
      tags.%:                       "1"
      tags.Name:                    "Rakesh"
      tenancy:                      <computed>
      volume_tags.%:                <computed>
      vpc_security_group_ids.#:     <computed>


Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_instance.my_terraform_instance: Creating...
  ami:                          "" => "ami-01469ab6166957456"
  arn:                          "" => "<computed>"
  associate_public_ip_address:  "" => "<computed>"
  availability_zone:            "" => "<computed>"
  cpu_core_count:               "" => "<computed>"
  cpu_threads_per_core:         "" => "<computed>"
  ebs_block_device.#:           "" => "<computed>"
  ephemeral_block_device.#:     "" => "<computed>"
  get_password_data:            "" => "false"
  host_id:                      "" => "<computed>"
  instance_state:               "" => "<computed>"
  instance_type:                "" => "t2.micro"
  ipv6_address_count:           "" => "<computed>"
  ipv6_addresses.#:             "" => "<computed>"
  key_name:                     "" => "<computed>"
  network_interface.#:          "" => "<computed>"
  network_interface_id:         "" => "<computed>"
  password_data:                "" => "<computed>"
  placement_group:              "" => "<computed>"
  primary_network_interface_id: "" => "<computed>"
  private_dns:                  "" => "<computed>"
  private_ip:                   "" => "<computed>"
  public_dns:                   "" => "<computed>"
  public_ip:                    "" => "<computed>"
  root_block_device.#:          "" => "<computed>"
  security_groups.#:            "" => "<computed>"
  source_dest_check:            "" => "true"
  subnet_id:                    "" => "<computed>"
  tags.%:                       "" => "1"
  tags.Name:                    "" => "Rakesh"
  tenancy:                      "" => "<computed>"
  volume_tags.%:                "" => "<computed>"
  vpc_security_group_ids.#:     "" => "<computed>"
aws_instance.my_terraform_instance: Still creating... (10s elapsed)
aws_instance.my_terraform_instance: Still creating... (20s elapsed)
aws_instance.my_terraform_instance: Still creating... (30s elapsed)
aws_instance.my_terraform_instance: Creation complete after 38s (ID: i-0113dba004f0cdb16)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.


Voila..!!

You have created your first EC2 instance via Terraform.
Now log in to your AWS account and check that your newly created EC2 instance is running.
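
If you'd rather not dig through the console, you can also ask Terraform to print instance details for you. A minimal sketch (the output name is arbitrary; public_ip is one of the attributes shown as <computed> in the plan above). Add this to example.tf and run terraform apply again:

output "instance_public_ip" {
  value = "${aws_instance.my_terraform_instance.public_ip}"
}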


Clean up:

Destroy your resources via the command below so the instance doesn't keep running (and billing you). Like apply, it shows a plan of what will be removed and asks you to confirm with yes.

E:\home\Cloud_project\stage>terraform destroy






Thursday 5 September 2019

Big O calculation rules


There are four rules which help us identify Big O notation:

Rule 1: Worst case
Rule 2: Remove constants
Rule 3: Different terms for inputs
Rule 4: Drop non-dominant terms



Rule 1:

The very first rule when it comes to Big O: always consider the worst case when calculating complexity.

If you look at the function below, we're looping through the entire array to find "rakesh".

package com.algo.practise;

import java.util.concurrent.TimeUnit;

public class Performance {
    public static void main(String[] args) {
        String[] ar = { "rks", "rk", "rakesh", "singhania", "kumar" };
        long start = System.nanoTime();

        for (int i = 0; i < ar.length; i++) {
            System.out.println("Loop");
            if (ar[i].equals("rakesh")) {
                System.out.println("found");
            }
        }

        long end = System.nanoTime();
        long durationInMicros = TimeUnit.MICROSECONDS.convert(end - start, TimeUnit.NANOSECONDS);
        System.out.print(durationInMicros + " MICROSECONDS");
    }
}

output:
Loop
Loop
Loop
found
Loop
Loop
217 MICROSECONDS

Well "rakesh" was only the third member in this array. And when we run this function we found "rakesh".

But the funny thing is this function ran 5 times not 3 times. We already found "rakesh" then why to run 5 times.

We can make this function a little bit more efficient . If a condition is met we just break out of this loop.


Here is the same function with that early exit added:

package com.algo.practise;

public class Performance {
    public static void main(String[] args) {
        String[] ar = { "rks", "rk", "rakesh", "singhania", "kumar" };

        for (int i = 0; i < ar.length; i++) {
            System.out.println("Loop");
            if (ar[i].equals("rakesh")) {
                System.out.println("found");
                break; // stop as soon as we find what we're looking for
            }
        }
    }
}

Output:
Loop
Loop
Loop
found

But when it comes to Big O, only the worst case matters, and the worst case is that "rakesh" is at the very end of the array.

So even with the break statement, in the worst case we still run the loop once per element. The break improves the best and average cases, but the Big O is still O(n).



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

33
package com.algo.practise;

public class Performance {
    public static void main(String[] args) {
        String[] ar = { "rks", "rk", "rakesh", "singhania", "kumar" };

        // First pass over the array: n operations
        for (int i = 0; i < ar.length; i++) {
            if (ar[i].equals("rakesh")) {
                System.out.println("found");
            }
        }

        // Second pass over the same array: another n operations
        for (int i = 0; i < ar.length; i++) {
            if (ar[i].equals("rk")) {
                System.out.println("found");
            }
        }

        // A fixed 100 operations, regardless of input size
        for (int i = 0; i < 100; i++) {
            System.out.println("found");
        }
    }
}

Here the Big O is O(2n + 100).

We can ignore constants like the 2 and the 100, because as the array gets bigger and bigger they stop mattering: if n is a million, adding an extra hundred steps on top doesn't really change anything.

So our Big O results in O(n) after dropping constants.


Rule 3:

Use different terms for different inputs. Let's look at an example.


26
package com.algo.practise;

import java.util.Arrays;

public class Performance {
    public static void main(String[] args) {
        String[] ar = { "rks", "rk", "rakesh", "singhania", "kumar" };
        String[] large = new String[1000];
        Arrays.fill(large, "rakesh");

        // Loop over the first input: a operations
        for (int i = 0; i < ar.length; i++) {
            if (ar[i].equals("rakesh")) {
                System.out.println("found");
            }
        }

        // Loop over the second, independent input: b operations
        for (int i = 0; i < large.length; i++) {
            if (large[i].equals("rakesh")) {
                System.out.println("found");
            }
        }
    }
}

This is almost the same function we saw in Rule 2: two loops, one after the other.

In Rule 2 both loops ran over the same array, so after dropping the constants the whole thing became O(n).

But the third rule says different inputs get different terms. Here the first loop and the second loop run over two different inputs: one could be a hundred items long, the other just one item.

So the Big O becomes O(a + b), where a and b are arbitrary letters standing for the two input sizes.

If the loops are nested instead of sequential, the Big O changes:

class Main {
    public static void main(String[] args) {
        int[] numbers = { 1, 2, 3, 4, 5 };
        // Nested loops: for every element we loop over all elements again
        for (int j = 0; j < numbers.length; j++) {
            for (int i = 0; i < numbers.length; i++) {
                System.out.println(numbers[i] + "," + numbers[j]);
            }
        }
    }
}

With two different nested inputs it becomes O(a * b); when both loops run over the same input, as here, that is O(n^2).


Rule 4:

Drop non-dominant terms.

Look at the example below:

package com.algo.practise;

public class Performance {
    public static void main(String[] args) {
        int[] ar = { 1, 2, 3, 4, 5, 6 };

        for (int i = 0; i < ar.length; i++) {
            System.out.println(ar[i]);
        } // O(n)

        System.out.println("Sum of numbers");
        for (int i = 0; i < ar.length; i++) {
            for (int j = 0; j < ar.length; j++) {
                System.out.println(ar[i] + ar[j]);
            }
        } // O(n^2)
    }
}


So again we're just looping over the array and logging out the numbers, and then we have another step where we sum each pair of numbers, adding them one after another.

If we work out the Big O of this method, it comes to O(n) + O(n^2).

By rule number four we drop the non-dominant terms. That means we keep only the most important, fastest-growing term: as n grows, n^2 dwarfs n. So we drop the n and just keep the n^2 (in the same way, something like O(n^2 + 3n + 100) simplifies to O(n^2)).

Which results in O(n^2).





Wednesday 4 September 2019

Big O notation


Big-O Analysis of Algorithms
The Big O notation defines an upper bound of an algorithm; it bounds a function only from above.

Don't get too hung up on the formal definition. Let's start by simply measuring how long a function takes to run. We can do this in Java by starting a timer before the loop begins and reading it again when the loop ends:

package com.algo.practise;

import java.util.concurrent.TimeUnit;

public class Performance {
    public static void main(String[] args) {
        String[] ar = { "rks", "rk", "rakesh", "singhania", "kumar" };

        // Start the timer (nanosecond precision)
        long start = System.nanoTime();
        for (int i = 0; i < ar.length; i++) {
            if (ar[i].equals("rakesh")) {
                System.out.println("found");
            }
        }

        // Stop the timer and report the elapsed time in microseconds
        long end = System.nanoTime();
        long durationInMicros = TimeUnit.MICROSECONDS.convert(end - start, TimeUnit.NANOSECONDS);
        System.out.print(durationInMicros + " MICROSECONDS");
    }
}

Output:
found
136 MICROSECONDS

It gives us the result in microseconds.

If I keep running it, the time varies a little, but only by a few microseconds. That's because this is really fast; our machines are extremely fast these days.

So instead of a small array, let's use an array that has a lot more items. We'll call it large, and we can create this massive array with Arrays.fill:


package com.algo.practise;

import java.util.Arrays;
import java.util.concurrent.TimeUnit;

public class Performance {
    public static void main(String[] args) {
        // 1000 copies of "rakesh" instead of a 5-element array
        String[] large = new String[1000];
        Arrays.fill(large, "rakesh");

        long start = System.nanoTime();

        for (int i = 0; i < large.length; i++) {
            if (large[i].equals("rakesh")) {
                System.out.println("found");
            }
        }

        long end = System.nanoTime();
        long durationInMicros = TimeUnit.MICROSECONDS.convert(end - start, TimeUnit.NANOSECONDS);
        System.out.print(durationInMicros + " MICROSECONDS");
    }
}

output:
found
found
found
found
19035 MICROSECONDS

That took a lot longer.

What if we had a massive array of 10,000 items?

Output:
found
found
found
found
356321 MICROSECONDS

We see that as our input grew, our function became slower and slower.

But here's the problem: if you take this code and run it on your computer, your time is going to be different than mine.

Every run of this code gives a different number; it might be a lot faster or a lot slower, which gets frustrating. Say I call my friend across the world, let's call him Anurag, and I tell him: "Hey Anurag, my code is so amazing. I've created this find function and it runs in 1.9 seconds with a hundred thousand inputs."

And then Anurag says: "That's really awesome. But you know what? Mine runs a lot faster, in 1.5 seconds."

So how can we determine who wins? Do I win or does Anurag? Who has better code? This question is very common in the computing world.

Big-O notation is the language we use for talking about how long an algorithm takes to run.

We can compare two different algorithms, or in this case functions, using Big-O and say which one is better.




That's all it is.

This is what we call algorithmic efficiency big-O allows us to explain this concept.

Remember how in our function we initially had an array of just one which was finding "rakesh".

But then as we increase that array of 100000.

You saw that the number of operations or the number of things we do in the loop increased over and over and different functions have different big-O complexities. 

Just remember when we talk about Big O and scalability of code we simply mean when we grow bigger and bigger with our input.



No matter what, the way we have this code set up, if we have ten items in the array it's going to be ten operations: ten passes through the loop.

We can see a pattern here and draw a straight line through it: 5 items, 5 operations; 1,000 items, 1,000 operations.

This is a linear rate: as the number of inputs increases, the number of operations increases at the same rate.

Now we've learned our very first Big-O notation:

O(n), or linear time. (n is just an arbitrary letter; I could put x or y in here if I wanted.)
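
Since wall-clock time differs from machine to machine, a more stable way to see this linear growth is to count operations instead of microseconds. A minimal sketch (not from the original experiment; the sizes are arbitrary):

import java.util.Arrays;

public class OperationCounter {
    public static void main(String[] args) {
        for (int n : new int[] { 10, 100, 1000 }) {
            String[] large = new String[n];
            Arrays.fill(large, "rakesh");

            long operations = 0;
            for (int i = 0; i < large.length; i++) {
                operations++; // one equality check per element
                if (large[i].equals("rakesh")) {
                    // found; keep counting to model the worst case
                }
            }
            // Prints "n items -> n operations": the growth is linear, O(n)
            System.out.println(n + " items -> " + operations + " operations");
        }
    }
}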


So let's talk about the next one, another very common Big-O notation that you're going to see. What happens if we have a function like this?


package com.algo.practise;

import java.util.Arrays;
import java.util.concurrent.TimeUnit;

public class Performance {
    public static void main(String[] args) {
        String[] large = new String[10000];
        Arrays.fill(large, "rakesh");
        long start = System.nanoTime();

        // O(1): a single array lookup, no matter how big the array is
        System.out.println(large[50]);

        long end = System.nanoTime();
        long durationInMicros = TimeUnit.MICROSECONDS.convert(end - start, TimeUnit.NANOSECONDS);
        System.out.print(durationInMicros + " MICROSECONDS");
    }
}

No matter how many times we call the method, we're always just grabbing one specific item from the array.

To look at this with an example: if we had an array of names and we ran it through a function that just takes the 50th item, the number of operations is one, no matter how big the array is. We're only doing one thing, so it runs in constant time.

O(1) - constant








Sunday 25 August 2019

Logging in AWS lambda

Problem :

Logs are a very important aspect of any microservice or AWS service, but in AWS, if your rate of generating CloudWatch logs is high, it can increase your AWS costs significantly. There should be some limit on the growth of the logs to keep costs under control.

Solution :

There are multiple ways to manage logging. Let's begin with some best practices, and then some framework-level changes which can help in reducing the overall logging volume.


Best Practices :


1. Most of the time we are only interested in finding error scenarios in the logs, so it is important to use log.error when logging error cases. Never use other logging levels for logging errors.

2. Never use print statements; instead, use a proper logging framework to log the statements.

3. Be diligent about the log statements and categorize them correctly at the different log levels: DEBUG, INFO, ERROR (see the sketch below).

4. Don't log data unless there is a critical need for it. Even when required, data should be printed only at DEBUG level.
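
To illustrate points 1-4, here is a minimal sketch using Log4j2; the class, method, and messages are invented for the example:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class OrderService {
    private static final Logger log = LogManager.getLogger(OrderService.class);

    public void process(String orderId, String payload) {
        log.debug("Payload for order {}: {}", orderId, payload); // data only at DEBUG
        log.info("Processing order {}", orderId);                // normal flow at INFO
        try {
            // ... business logic goes here ...
        } catch (Exception e) {
            log.error("Exception processing order " + orderId, e); // errors only at ERROR
        }
    }

    public static void main(String[] args) {
        new OrderService().process("42", "{\"amount\": 10}");
    }
}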



Framework level changes :


Apart from the best practices listed above, we also need some controls in place to keep a check on unwanted logging. Here are the kinds of services on which I have implemented this logging framework:

  •  Lambda functions - Python/Java
  •  ECS services - Java

It is important that we handle logging in both kinds of services.
The idea is to keep the logging level at a minimum for all development environments. So for most dev/test environments logging will be kept at ERROR, while UAT and Prod may run at INFO but can be raised to DEBUG temporarily for troubleshooting.





1. ECS services

The following steps should be taken to retrofit logging in all ECS components:

1. Add the following properties in application.properties. LOG_LEVEL acts as the default and is overridden per environment by the container environment variable we set below:

logging.level.root=${LOG_LEVEL}
LOG_LEVEL=INFO
#endpoints.loggers.sensitive=false
management.endpoints.web.exposure.include=health,info,loggers
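
With the loggers endpoint exposed, the root level can also be changed at runtime without a redeploy. A sketch, assuming the default actuator base path and a service listening on port 8080:

curl -i -X POST http://localhost:8080/actuator/loggers/ROOT \
  -H "Content-Type: application/json" \
  -d '{"configuredLevel": "DEBUG"}'

This adjusts the same root logger that logging.level.root configures at startup.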


2. Enable the Spring Boot Actuator framework by adding the following dependency in pom.xml:

pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>


3. Make the following changes in the CloudFormation (CF) template of the project:

1. Add a security group. Below is a sample config; rename the variables as per your project name and usage:

HelloComponentSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupName: !Sub "hello-component-${EnvironmentName}-sg"
    GroupDescription: ECS security group
    SecurityGroupIngress:
    - IpProtocol: tcp
      FromPort: 8080
      ToPort: 8080
      CidrIp: 10.0.0.0/8
    SecurityGroupEgress:
    - IpProtocol: tcp
      FromPort: 0
      ToPort: 1111
      CidrIp: 0.0.0.0/0
    VpcId:
      Fn::ImportValue: "vpc-id"
    Tags:
    - Key: Company
      Value: !Ref CompanyTag
    - Key: StageName
      Value: !Ref StageName
    - Key: EnvironmentName
      Value: !Ref EnvironmentName
    - Key: Name
      Value: !Sub "hello-component-${EnvironmentName}-sg"

2. Make sure to include the security group in the EcsService block:

EcsService:
  Type: 'AWS::ECS::Service'
  Properties:
    Cluster: !Sub ${ClusterName}-${EnvironmentName}
    ServiceName: !Sub "${ServiceName}-service"
    LaunchType: FARGATE
    DesiredCount: !Ref ServiceDesiredCount
    TaskDefinition: !Ref TaskDefinition
    DeploymentConfiguration:
      MaximumPercent: 200
      MinimumHealthyPercent: 100
    NetworkConfiguration:
      AwsvpcConfiguration:
        AssignPublicIp: DISABLED
        ############## Include the statements below #############
        SecurityGroups:
          - !Ref HelloComponentSecurityGroup

3. Add another environment variable in TaskDefinition → Environment:

Environment:
  -
    Name: LOG_LEVEL
    Value: !Ref LogLevel ### (This LogLevel field should be defined in parameters and should be derived through the config project)



2. Lambda Services

Let's define the logging strategy for both Python based and Java based Lambda functions:


2.1. Python based lambda

Python has an inbuilt logging framework which can be used to log statements at different logging levels (much like in Java). The following steps enable it:

1. Import logging in the main Lambda function Python file (the file which has lambda_handler), and before the beginning of the function write the following statements:

import logging
import os

logger = logging.getLogger()
logger.setLevel(os.environ['LOGGING_LEVEL'])


2. Use logger.info, logger.error, or logger.debug instead of print/context.log to log statements appropriately, as in the sketch below.
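
Putting both steps together, a minimal sketch of a handler (illustrative only; it assumes LOGGING_LEVEL is set as a Lambda environment variable, falling back to ERROR):

import logging
import os

logger = logging.getLogger()
logger.setLevel(os.environ.get('LOGGING_LEVEL', 'ERROR'))

def lambda_handler(event, context):
    logger.debug('Incoming event: %s', event)  # payload only at DEBUG
    logger.info('Handling request')            # normal flow at INFO
    if 'name' not in event:
        logger.error('Exception processing input: missing "name"')
        raise KeyError('name')
    return {'greeting': 'Hello ' + event['name']}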



2.2. Java based Lambda

Unfortunately, for the Lambda(s) built in Java, the standard context logging does not support different logging levels. So we will have to use the aws-lambda-java-log4j2 framework to bring Lambda logging in line with our other compute resources.

1. Add the following dependencies in pom.xml:

pom.xml:
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-lambda-java-log4j2</artifactId>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
</dependency>


2. Paste the following file at src/main/resources/log4j2.xml (the aws-lambda-java-log4j2 library ships a Lambda appender and expects Log4j 2 configuration, not the old log4j 1.x format):

log4j2.xml:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="com.amazonaws.services.lambda.runtime.log4j2">
 <Appenders>
  <Lambda name="Lambda">
   <PatternLayout>
    <Pattern>%d{yyyy-MM-dd HH:mm:ss} %X{AWSRequestId} %-5p %c{1}:%L - %m%n</Pattern>
   </PatternLayout>
  </Lambda>
 </Appenders>
 <Loggers>
  <Root level="ERROR">
   <AppenderRef ref="Lambda" />
  </Root>
 </Loggers>
</Configuration>

We need to replace context.log statements with log.debug / log.info / log.error statements as applicable. But before we can use the log statements, we need to define the logger for the given class:

private static final Logger log = LogManager.getLogger(Hello.class);
....
log.error("Error: Exception processing input " + e.getMessage(), e);






