Steps to set up an open source real-time IoT data pipeline in the Azure cloud

Architecture diagram:

Through this and a subsequent series of articles I would like to lay out a detailed step-by-step process for building a real-time IoT data pipeline, based primarily on Apache open source technologies, on the Azure cloud.

I start with the detailed architecture and instructions to build production-grade clusters.

In the subsequent article we shall see how the various processors provided by NiFi help us string together each of the tools above, and help us debug the data ingestion, processing and storage process using its data provenance feature.

Installing Cloudbreak on Azure:

Refer to https://docs.cloudera.com/HDPDocuments/Cloudbreak/Cloudbreak-2.8.0/install-azure/content/cb_vm-requirements.html

Step 1:

Spawn a CentOS 7 server in Azure cloud with 16GB RAM, 40GB disk, 4 cores

Do a sudo -i to execute all the following commands as root.

yum -y update

systemctl enable firewalld

systemctl start firewalld

systemctl status firewalld

setenforce 0

sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

getenforce (The command should return "Permissive", or "Disabled" after a reboot.)

Step 2:

yum install -y docker

systemctl start docker

systemctl enable docker

In case you face problems installing Docker using the above commands, check:

https://github.com/NaturalHistoryMuseum/scratchpads2/wiki/Install-Docker-and-Docker-Compose-(Centos-7)

https://docs.docker.com/install/linux/docker-ce/centos/

Step 3:

Check the Docker Logging Driver configuration:

docker info | grep "Logging Driver"

If it is set to Logging Driver: journald, you must set it to "json-file" instead. To do that:

  1. Open the docker file for editing:
vi /etc/sysconfig/docker
  2. Edit the following part of the file so that it looks like below (showing log-driver=json-file):
# Modify these options if you want to change the way the docker daemon runs
OPTIONS='--selinux-enabled --log-driver=json-file --signature-verification=false'
  3. Restart Docker:

systemctl restart docker

systemctl status docker
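As a quick sanity check, the active driver can also be read with Docker's --format flag (available on modern Docker CLIs). The helper below is a small sketch, not part of the original instructions: it succeeds only when the reported driver is json-file.

```shell
# Sketch: return success only when the reported logging driver is json-file.
check_log_driver() {
  [ "$1" = "json-file" ]
}

# On a live host you would feed it the real value, e.g.:
#   check_log_driver "$(docker info --format '{{.LoggingDriver}}')" || echo "edit /etc/sysconfig/docker"
check_log_driver "json-file" && echo "driver OK"
check_log_driver "journald"  || echo "switch to json-file"
```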

Step 4:

Install Cloudbreak on a VM:

yum -y install unzip tar

curl -Ls public-repo-1.hortonworks.com/HDP/cloudbreak/cloudbreak-deployer_2.8.0_$(uname)_x86_64.tgz | sudo tar -xz -C /bin cbd

cbd --version

mkdir cloudbreak-deployment

cd cloudbreak-deployment

In the directory, create a file called Profile with the following content:

export UAA_DEFAULT_SECRET=MY-SECRET

export UAA_DEFAULT_USER_PW=MY-PASSWORD

export UAA_DEFAULT_USER_EMAIL=MY-EMAIL

export PUBLIC_IP=MY_VM_IP

For example

export UAA_DEFAULT_SECRET=GravitySecret123

export UAA_DEFAULT_USER_PW=GravitySecurePassword123

export UAA_DEFAULT_USER_EMAIL=peter.smith@family.com

export PUBLIC_IP=172.22.222.222
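The Profile can equally be written in one shot with a heredoc. This is a sketch: the four values below are the placeholders from above and must be substituted with your own before running cbd.

```shell
# Sketch: create the Cloudbreak Profile non-interactively.
# All four values are placeholders; substitute your own.
cat > Profile <<'EOF'
export UAA_DEFAULT_SECRET=MY-SECRET
export UAA_DEFAULT_USER_PW=MY-PASSWORD
export UAA_DEFAULT_USER_EMAIL=MY-EMAIL
export PUBLIC_IP=MY_VM_IP
EOF

# Sanity check: the file should contain exactly four export lines.
grep -c '^export' Profile
```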

Generate configurations by executing

rm *.yml

cbd generate

The cbd start command includes the cbd generate command, which performs the following steps:

  • Creates the docker-compose.yml file, which describes the configuration of all the Docker containers required for the Cloudbreak deployment.
  • Creates the uaa.yml file, which holds the configuration of the identity server used to authenticate users with Cloudbreak.

Start the Cloudbreak application by using the following commands

cbd pull-parallel

cbd start

Check the Cloudbreak application logs:

cbd logs cloudbreak

If you observe a failure, a missing jq binary is a common cause; install jq:

wget -O jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64

chmod +x ./jq

cp jq /usr/bin

You should see a message like this in the log: Started CloudbreakApplication in 36.823 seconds.

Step 5:

Log in to the Cloudbreak UI at https://<ip_address>, or run

cbd start

again to print the login information.

Log in to the Cloudbreak web UI using the credentials that you configured in your Profile file:

  • The username is the UAA_DEFAULT_USER_EMAIL
  • The password is the UAA_DEFAULT_USER_PW

Creating a NiFi cluster:

Refer to url : https://community.cloudera.com/t5/Community-Articles/Using-Cloudbreak-to-create-a-Flow-Management-NiFi-cluster-on/ta-p/249132

Step 1:

Here we shall create a NiFi cluster using a custom blueprint; creating a NiFi cluster with the default blueprints fails.

Click on the CREATE BLUEPRINT button. You should see the Create Blueprint screen.

Step 2:

Click on the Upload JSON File button and upload the blueprint below.

{
  "Blueprints": {
    "blueprint_name": "hdf-nifi-no-kerberos",
    "stack_name": "HDF",
    "stack_version": "3.2"
  },
  "configurations": [
    {
      "nifi-ambari-config": {
        "nifi.security.encrypt.configuration.password": "changethisplease",
        "nifi.max_mem": "1g",
        "nifi.sensitive.props.key": "changethisplease"
      }
    },
    {
      "nifi-bootstrap-env": {
        "content": "#\n# Licensed to the Apache Software Foundation (ASF) under one or more\n# contributor license agreements. See the NOTICE file distributed with\n# this work for additional information regarding copyright ownership.\n# The ASF licenses this file to You under the Apache License, Version 2.0\n# (the \"License\"); you may not use this file except in compliance with\n# the License. You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n# Java command to use when running NiFi\njava=java\n\n# Username to use when running NiFi. This value will be ignored on Windows.\nrun.as={{nifi_user}}\n##run.as=root\n\n# Configure where NiFi's lib and conf directories live\nlib.dir={{nifi_install_dir}}/lib\nconf.dir={{nifi_config_dir}}\n\n# How long to wait after telling NiFi to shutdown before explicitly killing the Process\ngraceful.shutdown.seconds=20\n\n{% if security_enabled %}\njava.arg.0=-Djava.security.auth.login.config={{nifi_jaas_conf}}\n{% endif %}\n\n# Disable JSR 199 so that we can use JSP's without running a JDK\njava.arg.1=-Dorg.apache.jasper.compiler.disablejsr199=true\n\n# JVM memory settings\njava.arg.2=-Xms{{nifi_initial_mem}}\njava.arg.3=-Xmx{{nifi_max_mem}}\n\n# Enable Remote Debugging\n#java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000\n\njava.arg.4=-Djava.net.preferIPv4Stack=true\n\n# allowRestrictedHeaders is required for Cluster/Node communications to work properly\njava.arg.5=-Dsun.net.http.allowRestrictedHeaders=true\njava.arg.6=-Djava.protocol.handler.pkgs=sun.net.www.protocol\n\n# The G1GC is still considered experimental but has proven to be very 
advantageous in providing great\n# performance without significant \"stop-the-world\" delays.\njava.arg.13=-XX:+UseG1GC\n\n#Set headless mode by default\njava.arg.14=-Djava.awt.headless=true\n\n#Ambari Metrics Collector URL - passed in to flow.xml for AmbariReportingTask\njava.arg.15=-Dambari.metrics.collector.url=http://{{metrics_collector_host}}:{{metrics_collector_port}}/ws/v1/timeline/metrics\n\n#Application ID - used in flow.xml - passed into flow.xml for AmbariReportingTask\njava.arg.16=-Dambari.application.id=nifi\n\n#Sets the provider of SecureRandom to /dev/urandom to prevent blocking on VMs\njava.arg.17=-Djava.security.egd=file:/dev/urandom\n\n# Requires JAAS to use only the provided JAAS configuration to authenticate a Subject, without using any \"fallback\" methods (such as prompting for username/password)\n# Please see https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html, section \"EXCEPTIONS TO THE MODEL\"\njava.arg.18=-Djavax.security.auth.useSubjectCredsOnly=true\n\n###\n# Notification Services for notifying interested parties when NiFi is stopped, started, dies\n###\n\n# XML File that contains the definitions of the notification services\nnotification.services.file={{nifi_config_dir}}/bootstrap-notification-services.xml\n\n# In the case that we are unable to send a notification for an event, how many times should we retry?\nnotification.max.attempts=5\n\n# Comma-separated list of identifiers that are present in the notification.services.file; which services should be used to notify when NiFi is started?\n#nifi.start.notification.services=email-notification\n\n# Comma-separated list of identifiers that are present in the notification.services.file; which services should be used to notify when NiFi is stopped?\n#nifi.stop.notification.services=email-notification\n\n# Comma-separated list of identifiers that are present in the notification.services.file; which services should be used to notify when NiFi 
dies?\n#nifi.dead.notification.services=email-notification\n\njava.arg.tmpdir=-Djava.io.tmpdir=/usr/hdf/current/nifi/lib"
      }
    },
    {
      "nifi-properties": {
		"nifi.sensitive.props.key": "changemeplease",
		"nifi.security.identity.mapping.pattern.kerb": "^(.*?)@(.*?)$",
		"nifi.security.identity.mapping.value.kerb": "$1",
		"nifi.security.user.login.identity.provider": ""
      }
    },
    {
      "nifi-ambari-ssl-config": {
		"nifi.toolkit.tls.token": "changemeplease",
		"nifi.node.ssl.isenabled": "false",
		"nifi.toolkit.dn.prefix": "CN=",
		"nifi.toolkit.dn.suffix": ", OU=NIFI"
      }
    },
    {
      "nifi-registry-ambari-config": {
        "nifi.registry.security.encrypt.configuration.password": "changethisplease"
      }
    },
    {
      "nifi-registry-properties": {
        "nifi.registry.security.identity.mapping.pattern.kerb": "^(.*?)@(.*?)$",
        "nifi.registry.security.identity.mapping.value.kerb": "$1",
        "nifi.registry.db.password": "changethisplease"
      }
    },
    {
      "nifi-registry-ambari-ssl-config": {
		"nifi.registry.ssl.isenabled": "false",
		"nifi.registry.toolkit.dn.prefix": "CN=",
		"nifi.registry.toolkit.dn.suffix": ", OU=NIFI"
      }
    }
  ],
  "host_groups": [
    {
      "name": "Services",
      "components": [
        {
          "name": "NIFI_REGISTRY_MASTER"
        },
        {
          "name": "METRICS_COLLECTOR"
        },
        {
          "name": "METRICS_MONITOR"
        },
        {
          "name": "METRICS_GRAFANA"
        },
        {
          "name": "ZOOKEEPER_CLIENT"
        }
      ],
      "cardinality": "1"
    },
    {
      "name": "NiFi",
      "components": [
        {
          "name": "NIFI_MASTER"
        },
        {
          "name": "METRICS_MONITOR"
        },
        {
          "name": "ZOOKEEPER_CLIENT"
        }
      ],
      "cardinality": "1+"
    },
    {
      "name": "ZooKeeper",
      "components": [
        {
          "name": "ZOOKEEPER_SERVER"
        },
        {
          "name": "METRICS_MONITOR"
        },
        {
          "name": "ZOOKEEPER_CLIENT"
        }
      ],
      "cardinality": "3+"
    }
  ]
}

In the left menu, click on Clusters. Cloudbreak will display configured clusters. Click the CREATE CLUSTER button. Cloudbreak will display the Create Cluster wizard.

Step 3:

To create a credential in Azure, follow these steps.

We shall be creating an interactive credential

  • In the Cloudbreak web UI, select Credentials from the navigation pane.
  • Click Create Credential.
  • Under Cloud provider, select “Microsoft Azure”.
  • Select Interactive Login.
  • Provide the following information:
Name: Enter a name for your credential.
Description: (Optional) Enter a description.
Subscription Id: Copy and paste the Subscription ID from your Subscriptions.
Tenant Id: Copy and paste your Directory ID from your Active Directory > Properties.
Azure role type: You have the following options: "Use existing Contributor role" (default).

While creating credentials from Azure for Cloudbreak, get the tenant details from Azure Active Directory -> Properties -> Directory ID.

In case of errors, refer to: https://docs.cloudera.com/HDPDocuments/Cloudbreak/Cloudbreak-2.9.0/create-credential-azure/content/cb_create-interactive-credential.html

Step 4:

General Configuration

By default, the General Configuration screen is displayed using the BASIC view.

  • Credential: Select the Azure credential you created in the above step.
  • Cluster Name: Enter a name for your cluster. The name must be between 5 and 40 characters, must start with a letter, and must only include lowercase letters, numbers, and hyphens.
  • Region: Select the region in which you would like to launch your cluster.
  • Platform Version: Cloudbreak currently defaults to HDP 2.6. Select the dropdown arrow and select the HDF version matching the blueprint (HDF 3.2 here).
  • Cluster Type: As mentioned previously, there are two supported cluster types. Make sure to select the blueprint you just created.

Click the green NEXT button.

Step 5:

Hardware and Storage

Cloudbreak will display the Hardware and Storage screen. On this screen, you have the ability to change the instance types, attached storage, and where the Ambari server will be installed. As you can see, we will deploy 1 NiFi and 1 Zookeeper node. In a production environment you would typically have at least 3 Zookeeper nodes. We will use the defaults.

Click the green NEXT button.

Step 6:

Gateway Configuration

Cloudbreak will display the Gateway Configuration screen. On this screen, you have the ability to enable a protected gateway. This gateway uses Knox to provide a secure access point for the cluster. We will leave this option disabled.

Click the green NEXT button.

Step 7:

Network

Cloudbreak will display the Network screen. On this screen, you have the ability to specify the Network, Subnet, and Security Groups. Cloudbreak defaults to creating new configurations. We will use the default options to create new configurations.

Because we are using a custom blueprint which disables SSL, we need to update the security groups with correct ports for the NiFi and NiFi Registry UIs. In the SERVICES security group, add the port 61080 with TCP. Click the + button to add the rule. In the NIFI security group, add the port 9090 with TCP. Click the + button to add the rule.
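If you prefer scripting over the UI, the same two inbound rules can be added with the Azure CLI. This is a sketch: the resource group and NSG names below are hypothetical placeholders, and the az commands are only printed for review rather than executed.

```shell
# Sketch: print the az commands for the two extra inbound rules
# (61080 = NiFi Registry UI, 9090 = NiFi UI).
# my-rg and the NSG names are hypothetical; adjust before running.
nsg_rules() {
  for spec in "services-nsg 61080" "nifi-nsg 9090"; do
    set -- $spec
    echo "az network nsg rule create --resource-group my-rg --nsg-name $1 --name allow-$2 --priority 110 --protocol Tcp --destination-port-ranges $2 --access Allow"
  done
}
nsg_rules    # review the output, then run each command (or pipe to sh)
```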

You should see something similar to the following:

Click the green NEXT button.

Step 8:

Security

Cloudbreak will display the Security screen. On this screen, you have the ability to specify the Ambari admin username and password. You can create a new SSH key or select an existing one.

ssh-keygen -t rsa

Copy the public key generated by the command above (~/.ssh/id_rsa.pub) into the SSH public key field.

You have the ability to display a JSON version of the blueprint. You also have the ability to display a JSON version of the cluster definition. Both of these can be used with the Cloudbreak CLI to programmatically automate these operations.

Click the green CREATE CLUSTER button.

Step 9:

Cluster Summary

Cloudbreak will display the Cluster Summary page. It will generally take between 10 and 15 minutes for the cluster to be fully deployed. As you can see, this screen looks similar to that of an HDP cluster. The big difference is the Blueprint and HDF Version.

Click on the Ambari URL to open the Ambari UI.

Step 10:

Ambari

You will likely see a browser warning when you first open the Ambari UI. That is because we are using self-signed certificates.

Click on the ADVANCED button. Then click the link to Proceed.

You will be presented with the Ambari login page. Log in using the username and password you specified when you created the cluster.

You should see the cluster summary screen. As you can see, we have a cluster with Zookeeper, NiFi, and the NiFi Registry.

Click on the NiFi service in the left hand menu. Now you can access the Quick Links menu for a shortcut to the NiFi UI.

You should see the NiFi UI.

Back in the Ambari UI, click on the NiFi Registry service in the left hand menu. Now you can access the Quick Links menu for a shortcut to the NiFi Registry UI.

You should see the NiFi Registry UI.

Creating a Kafka cluster:

Step 1:

Follow the same instructions as above in the Cloudbreak UI (similar to the NiFi cluster) for creating Kafka, and you should be able to create it easily. Unlike NiFi, it does not require a custom blueprint.

Creating a Cassandra Cluster:

Step 1:

Spawn 2 CentOS 7 machines in Azure cloud

Step 2:

Install Cassandra

vi /etc/yum.repos.d/datastax.repo

[datastax]

name = DataStax Repo for Apache Cassandra

baseurl = http://rpm.datastax.com/community

enabled = 1

gpgcheck = 0

yum install dsc30

yum install cassandra30-tools

Step 3:

Open the following ports in Azure for inbound traffic

7000

7001

7199

9042

9160

9142

Also open them in CentOS using the following commands for each of the ports:

# firewall-cmd --zone=public --add-port=7000/tcp --permanent

# firewall-cmd --reload

# iptables-save | grep 7000
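Repeating the firewall-cmd invocation for each of the six ports can be scripted. The loop below is a sketch that only prints the commands so they can be reviewed first; pipe to sh (as root, on both nodes) to execute.

```shell
# Sketch: emit one permanent firewalld rule per Cassandra port
# from Step 3, followed by a single reload.
cassandra_fw_rules() {
  for port in 7000 7001 7199 9042 9160 9142; do
    echo "firewall-cmd --zone=public --add-port=${port}/tcp --permanent"
  done
  echo "firewall-cmd --reload"
}
cassandra_fw_rules    # review, then: cassandra_fw_rules | sh
```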

Step 4:

Modify the following entry in /etc/cassandra/conf/cassandra-env.sh on both machines:

JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=martians.eastus.cloudapp.azure.com"

Step 5:

Modify the following entries in /etc/cassandra/conf/cassandra.yaml on both machines:

- seeds: "martians.eastus.cloudapp.azure.com,martiansext.eastus.cloudapp.azure.com"

On machine 1 only

listen_address: martians.eastus.cloudapp.azure.com

rpc_address: 0.0.0.0

rpc_port: 9160

start_rpc: true

broadcast_rpc_address: martians.eastus.cloudapp.azure.com

On machine 2 only

listen_address: martiansext.eastus.cloudapp.azure.com

rpc_address: 0.0.0.0

rpc_port: 9160

start_rpc: true

broadcast_rpc_address: martiansext.eastus.cloudapp.azure.com
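The cassandra.yaml edits above can be applied with sed instead of a hand edit. The snippet below is a sketch that runs against a stand-in file in /tmp so it can be tried anywhere; on the real VMs, point Y at /etc/cassandra/conf/cassandra.yaml and swap in the machine-2 hostname where appropriate (the key names assume a stock Cassandra 3.0 cassandra.yaml).

```shell
# Stand-in file so the sed expressions can be verified safely;
# use Y=/etc/cassandra/conf/cassandra.yaml on the actual node.
Y=/tmp/cassandra.yaml
printf 'listen_address: localhost\nrpc_address: localhost\nstart_rpc: false\n' > "$Y"

# Machine-1 values from Step 5; use martiansext... on machine 2.
sed -i \
  -e 's/^listen_address:.*/listen_address: martians.eastus.cloudapp.azure.com/' \
  -e 's/^rpc_address:.*/rpc_address: 0.0.0.0/' \
  -e 's/^start_rpc:.*/start_rpc: true/' "$Y"

grep '^start_rpc' "$Y"
```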

Step 6:

Modify the /etc/hosts file to add the following entries. Use the private IPs on the cluster VMs themselves and the public IPs on machines outside the virtual network; do not map the same hostname to both IPs in a single hosts file.

10.8.8.8 martians.eastus.cloudapp.azure.com (private IPs)

10.9.9.9 martiansext.eastus.cloudapp.azure.com

13.7.7.7 martians.eastus.cloudapp.azure.com (public IPs)

40.6.66.66 martiansext.eastus.cloudapp.azure.com

Step 7:

systemctl restart cassandra

nodetool status

Datacenter: datacenter1

=======================

Status=Up/Down

|/ State=Normal/Leaving/Joining/Moving

--  Address   Load       Tokens       Owns (effective)  Host ID                               Rack

UN  10.8.8.8  263.24 KB  256          100.0%            943b9686-ec28-49ed-a8f8-5223d88df8b1  rack1

UN  10.9.9.9  287.54 KB  256          100.0%            29a37841-5af4-47cd-9065-5a1c953274f1  rack1

Step 8:

cqlsh martians.eastus.cloudapp.azure.com

First create the keyspace the table lives in (the CREATE TABLE below requires it; SimpleStrategy with a replication factor of 2 is a reasonable choice for this two-node cluster):

CREATE KEYSPACE IF NOT EXISTS Device WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

CREATE TABLE Device.data (
    Serial_number text,
    dimension int,
    param_reading1 text,
    param_reading2 text,
    param_readingN text,
    PRIMARY KEY (Serial_number)
);

INSERT INTO Device.data (Serial_number, dimension, param_reading1, param_reading2, param_readingN) VALUES ('Boeing737', 45, 'LCM-100', 'HCM-200', 'TFT-400');

Run cqlsh martiansext.eastus.cloudapp.azure.com

select * from Device.data;

You should see the record that was added, confirming replication across the nodes.

Creating a Spark Cluster:

Step 1:

Spawn 3 CentOS-based Linux VMs in the Azure cloud. One of these will serve as the master and the other 2 as slaves.

gravitymaster.eastus.cloudapp.azure.com

52.888.88.888 (public ip)

10.9.9.9 (private ip)

gravityslave1.eastus.cloudapp.azure.com

52.777.777.777

10.6.6.6

gravityslave2.eastus.cloudapp.azure.com

13.55.555.555

10.4.4.4

Ensure that you use the same user ID on all 3 machines, say gravitymaster.

Step 2:

Add the following 3 entries to the /etc/hosts file of gravitymaster.

52.888.88.888 gravitymaster.eastus.cloudapp.azure.com

52.777.777.777 gravityslave1.eastus.cloudapp.azure.com

13.55.555.555 gravityslave2.eastus.cloudapp.azure.com

Step 3:

Install Java on the master and slave machines using:

wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm

yum localinstall jdk-8u131-linux-x64.rpm

sudo alternatives --config java

sudo sh -c "echo export JAVA_HOME=/usr/java/jdk1.8.0_131/jre >> /etc/environment"

java -version

Step 4:

Install Scala on the master and slave machines using:

wget https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.rpm

yum install scala-2.13.0.rpm

scala -version

Step 5:

Configure passwordless SSH from the master to the slave machines.

[gravitymaster@gravitymaster ~]$ ssh-keygen -t rsa

Press enter wherever prompted

Execute the following commands

ssh gravitymaster@gravityslave1.eastus.cloudapp.azure.com mkdir -p .ssh

cat .ssh/id_rsa.pub | ssh gravitymaster@gravityslave1.eastus.cloudapp.azure.com 'cat >> .ssh/authorized_keys'

ssh gravitymaster@gravityslave1.eastus.cloudapp.azure.com "chmod 700 .ssh; chmod 640 .ssh/authorized_keys"

ssh gravitymaster@gravityslave1.eastus.cloudapp.azure.com

exit

ssh gravitymaster@gravityslave2.eastus.cloudapp.azure.com mkdir -p .ssh

cat .ssh/id_rsa.pub | ssh gravitymaster@gravityslave2.eastus.cloudapp.azure.com 'cat >> .ssh/authorized_keys'

ssh gravitymaster@gravityslave2.eastus.cloudapp.azure.com "chmod 700 .ssh; chmod 640 .ssh/authorized_keys"

ssh gravitymaster@gravityslave2.eastus.cloudapp.azure.com
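The four commands per slave above can be collapsed into a loop. Where the openssh-clients package provides ssh-copy-id, it performs the mkdir, append, and chmod steps in one go; the sketch below only echoes the commands for review (the hostnames are this article's examples).

```shell
# Sketch: one ssh-copy-id invocation per slave instead of the
# manual mkdir / cat-over-ssh / chmod sequence.
push_keys() {
  for host in gravityslave1 gravityslave2; do
    echo "ssh-copy-id gravitymaster@${host}.eastus.cloudapp.azure.com"
  done
}
push_keys    # review, then: push_keys | sh
```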

Step 6:

Install Spark on all 3 VMs using:

wget https://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz

tar xvf spark-2.4.4-bin-hadoop2.7.tgz

mv spark-2.4.4-bin-hadoop2.7 /usr/local/spark

vi ~/.bashrc

Add the following line to ~/.bashrc file.

export PATH=$PATH:/usr/local/spark/bin

$ source ~/.bashrc

Step 7:

Make the following changes to the master server.

cd /usr/local/spark/conf

cp spark-env.sh.template spark-env.sh

vi spark-env.sh

export SPARK_MASTER_HOST='<MASTER-IP>' (make sure this is the private IP of the Azure server)

export JAVA_HOME=/usr/java/jdk1.8.0_131/jre

Step 8:

Edit the slaves configuration file on the master machine (vi /usr/local/spark/conf/slaves) and add the following entries:

gravitymaster.eastus.cloudapp.azure.com

gravityslave1.eastus.cloudapp.azure.com

gravityslave2.eastus.cloudapp.azure.com
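The three hostnames can be written to the slaves file in one step. The heredoc below is a sketch targeting /tmp/slaves so it is safe to try anywhere; on the master, write to /usr/local/spark/conf/slaves instead.

```shell
# Sketch: generate the Spark slaves file (one worker hostname per line).
cat > /tmp/slaves <<'EOF'
gravitymaster.eastus.cloudapp.azure.com
gravityslave1.eastus.cloudapp.azure.com
gravityslave2.eastus.cloudapp.azure.com
EOF

# Sanity check: three entries.
wc -l < /tmp/slaves
```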

Step 9:

Start the Spark cluster using the following commands on the master:

$ cd /usr/local/spark

$ ./sbin/start-all.sh

Step 10:

Access the Spark web UI at:

http://gravitymaster.eastus.cloudapp.azure.com:8080/

In case of issues, check the master and slave logs located at:

/usr/local/spark/logs/spark-gravitymaster-org.apache.spark.deploy.master.Master-1-gravitymaster.eastus.cloudapp.azure.com.out

/usr/local/spark/logs/spark-gravitymaster-org.apache.spark.deploy.worker.Worker-1-gravityslave1.out

/usr/local/spark/logs/spark-gravitymaster-org.apache.spark.deploy.worker.Worker-1-gravityslave2.out

Step 11:

To stop the above processes:

$ cd /usr/local/spark
$ ./sbin/stop-all.sh

Or check the process IDs from the log files in the previous step and run

kill -9 <processid>

101 Challenges of any offshore Information Technology Service Provider

Information technology service provider companies go through several challenges. Here I have captured just 101 of the many challenges they have faced over the years. Kindly read through to understand their pain points.

1. Customer

1.1 Proposal & Bidding Stage

1. How do you arrive at a good effort and cost estimate for a proposal based upon the minimal information provided at times in an RFP (Request for Proposal) by the customer.

2. How do you justify the effort estimate in a proposal to the customer, given that 20% of the technical project tasks consume 70% of the project effort. This is because if developers get stuck on a technical defect or missing feature, it could take days to resolve it or implement a workaround; the delay could be greater with third-party and off-the-shelf tools.

3. How do you arrive at the optimal cost of a proposal when there are various internal groups, teams, verticals, horizontals and other sub vendors demanding a larger size of the pie (customer business)

4. How do you convince the customer to go with a fixed-bid vs a time-and-material model for a project, based upon what is less risky for your organization.

5. How do you make a decision on whether to propose a custom developed application vs a customized application using an off the shelf product from the marketplace.

1.2 Engagement Kickoff stage

6. How do you engage the right developers both internally and externally by interviewing them and reaching out for references.

7. How do you organize the right candidates to travel overseas to the customer site for the scoping and business discussions phase, given that the candidates who are right for the task might not have the visas, and the candidates who have the visas might not be available for travel or are yet to be deboarded from other projects, engagements or practices.

8. How do you setup devops pipelines and code quality gates upfront before the start of project and still meet the schedule.

9. How do you justify on-boarding a full-time project manager or full-time technical architect or a full-time security architect or full-time devops architect on a frugal customer budget. And if they are on-boarded part-time, then how do you ensure that they do not end up spending 100% of their time on the project, which unfortunately is a frequently observed scenario.

10. How do you ensure that the background check of candidates is completed in time before on-boarding them into projects.

11. How do you deal with a delay in providing functional and technical inputs from the customer.

12. How do you ensure that there is workspace, connectivity, servers and other infrastructure available for a project team to hit the ground running, at least a couple of weeks before a bid is awarded and the project kicked off.

1.3 Peak Development Phase

13. How do you ensure that project tasks are divided logically and are evenly distributed among all the developers, so that each developer is equally loaded and a few developers are not burned out while the others try to create tasks for themselves.

14. How do you deal with code and design changes trickling down due to requirement and functional changes from Business and most importantly differentiate a change from a misinterpretation of requirements.

15. How do you ensure that the technical leads, architects and developers who might be loaded with design and development tasks are doing a good job at reviewing the code and documents.

16. How do you ensure that the project team creates time to incorporate the feedback and review comments provided on the project deliverables.

17. How do you ensure that the customer SME calendars are blocked well in advance for reviews and inputs and any changes originating from those reviews are incorporated as design and developmental changes without slippage in effort, schedule or cost.

18. How do you deal with customer escalations on quality of deliverables, infrastructure or human resources.

19. How do you ensure that candidates who are deboarded from projects temporarily due to customer budget constraints are kept engaged or billable in other projects so as not to drag down the revenue of the group/org as a whole.

20. How do you deal with projects that have been long running at a low billing rate.

21. How do you ensure that the data that has grown exponentially in the back-end over a long period of time in a system/application is migrated from an RDBMS to a NoSQL DB or from on-premise to cloud without causing any major disruption.

22. How do you ensure that an over-customized off-the-shelf tool/product used in an application is migrated from one version to the next without causing a load of regression defects.

23. How do you ensure that you do not over commit and under-deliver on requirements both functional and performance.

24. How do you equip the existing development team to handle the increased work pressure due to frequent separation of co-developers from the organization.

25. How do you deal with a team from other vendors or competing vendor organization who have been handed over a piece of the pie or project and have been too eager to escalate any delay in dependent deliverables to the customer.

26. How do you ensure that development team members get a chance to attend seminars, meetings and training which might be beneficial to the organization and help further their personal career but may or may not be directly related to the project.

27. How do you ensure that the development team gets the required inputs in case they are stuck while the customer organization is on long holidays.

28. How do you manage the deliverables when the development team has requested a holiday due to major and long local festivals or personal time off while the customer organization is working.

29. How do you manage permanently or temporarily “work from home” employees and get the required output from them.

30. How do you manage remotely working employees in a different city or from coast to coast, keep their travel expenses under control and get the required output from them.

31. How do you get help for developers in debugging a problem at the right time, internally from within the organization or from external consultants, so that the project schedule stays under control.

32. How do you deal with the organizational culture of hitherto unexplored countries and geographies.

33. How do you ensure that the onsite team has enough inputs at the beginning of their day before you wind up and leave for the day at offshore, and vice-versa.

34. How do you ensure that project documentation is short, crisp and visual, does not take up too much of the project team's time, and can be easily updated in the future.

35. How do you ensure that code is properly version controlled in order to avoid accidental overwriting and other issues.

36. How do you ensure that maximum parallelism is achieved in development by ensuring that independent teams mock the data that is planned to be available at a later point in time from a peer team.

37. How do you ensure that a delay in deliverables for a particular sprint is communicated well in advance to the customer instead of giving a heads up at the last moment.

1.4 Testing

38. How do you keep the testing team well engaged when development phase deliverables are delayed.

39. How do you ensure that the customer does not spot new defects that were not detected by the internal testing team.

40. How do you ensure that the testing team is geared up for full automation of testing.

41. How do you ensure that any production data is deidentified or massaged before being used in a preprod testing environment.

42. How do you ensure that production application users do not receive unwanted / spam emails from the application.

1.5 GO Live Phase

43. How do you ensure that the customer business is not disrupted and the customer’s customers are not dissatisfied during the GO Live phase while transitioning from one system to another.

44. How do you ensure that production issues get utmost attention they deserve and are fixed in time to maintain customer delight.

45. How do you ensure that downtime for customer’s customers is minimized during the go live phase.

46. How do you ensure that downtimes and upcoming features are proactively and in-detail communicated to customer’s customers.

47. How do you ensure that a go-live checklist is complete and comprehensive, and can be rolled back in case of failure.

1.6 Maintenance and Production Support Phase

48. How do you keep employee morale high on maintenance and production support projects when they might be required to do seemingly uninteresting bug fixes and may even get called at odd times to fix production jobs/processes.

49. How do you maintain a low bench strength in between projects.

50. How, and whether, to proactively hire candidates in expectation of large upcoming deals in the pipeline, and also ensure that you are not saddled with a huge bench in case the expected deals fail to materialize.

2. Foreign Govt

51. How do you comply with foreign govt requirements regarding hiring of local talent to staff projects, in spite of continuously conducting interviews and being unable to find the right or adequate number of candidates, and yet be able to meet the customer specified schedule.

52. How do you ensure that you do not create a perception of replacing local workers with migrant workers.

3. Sales

53. How do you ensure that your organization stays in the race against competition in the bidding process when you know that the cost of your proposal is much higher than your competitor's. Reduce it and you risk burning out the development team later in the project and making the whole engagement unprofitable; raise it and you risk losing the bid.

54. How do you reach out to and coordinate with various internal teams to ensure that the presentations, source code, hardware, models and personnel required for a demo are all organized at short notice to showcase the relevant capabilities, skills and experience to the customer.

55. How do you look for the right leads and opportunities in both existing and new lines of business, for existing as well as new customers.

56. How do you continuously watch for market updates on a line of business or technology, and keep tabs on competitor products or tools, in order to continuously improve the solutions delivered to an existing customer.

57. How do you deal with the sudden departure of a top management executive from the customer organization who has until then been very supportive of outsourcing to a particular vendor organization.

58. How do you handle changing times, where customers want to see unique IP or accelerator frameworks developed by an organization before awarding the contract, rather than making a decision based purely on billing rates.

59. How do you balance your business across various continents and geographies so that your risk is spread in case of an economic downturn in a particular geography.

4. IT Operations

60. How do you ensure that there is no slippage in schedule on an engagement due to the time taken to set up hardware and software infrastructure.

61. How do you get the approvals and deviations required to access special sites needed for development when strong customer data protection laws are in place.

62. How do you deal with infrastructure downtimes and sudden outages which are due to telecom service providers and not under the control of internal IT operations.

63. How do you ensure that the work environment is secured from viruses, malware and phishing attacks, which could lead to a major loss of productivity.

5. Quality

64. How do you bake the time for internal quality processes such as reviews, root cause analysis, traceability and audits into project effort estimates, and justify it to customers who are looking for the best deal in the market.

65. How do you justify the need for the vendor's internal quality processes, documentation and personnel when the customer is more comfortable with their own internal quality team.

66. How do you ensure that there are proper disaster recovery, backup and restoration procedures in place in case of a crash.

67. How do you ensure that there is proper knowledge transition when a product or application is handed over from one team to another or from one organization to another.

6. Human Resources & Recruitment

68. How do you maintain the morale of employees who are looking for a change in project, assignment or work location.

69. How do you retain employees who are dissatisfied with their current salary or designation.

70. How do you deal with employee fatigue and delays in reaching the workplace due to infrastructure problems in the city.

71. How do you deal with employees who refuse to work from client locations.

72. How do you organize the right training programs to upskill and cross-skill employees across various projects.

73. How do you deal with poor performers and ensure that they are either retrained or moved out of a project in time, so as not to drag down the performance and rating of the entire engagement.

74. How do you encourage the creation of more patents, IP and research papers within the organization.

75. How do you ensure that proper processes and avenues are available for individual employees and teams to share the knowledge gained across projects, in order to avoid duplication and rework.

76. How do you ensure that there is a right mix of fresh graduates from university vs lateral recruits in the hiring process.

77. How do you ensure that positive and accurate messages are sent out by current and past employees on employer review portals like Glassdoor.

78. How do you ensure that job offers to campus recruits are honored in case of an economic downturn, or at least that adequate arrangements are made for them to pursue an alternate opportunity.

79. How do you ensure that employees being deputed overseas are briefed adequately, so that they carry themselves graciously and act as ambassadors of their native country.

7. Upper Management

80. How do you search for, hunt, create, demonstrate and grow the business at the tall target growth rate set by the CEO, and at the same time ensure that delivery on existing projects and engagements gets the adequate attention it deserves.

81. How do you correct the customer's perception of the group or organization as a whole, carried over from a previous slippage in delivery.

82. How do you handle the various personalities and characteristics of stakeholders in the customer organization: one who is always on the verge of escalating every other issue, another who is very supportive of a competing vendor organization, and yet another who supports you and speaks on your behalf.

83. How do you keep the team motivated with outings, excursions, luncheons and other team building activities.

84. How do you decide what percentage of your profits to give away to charity and other corporate social responsibility initiatives.

85. How do you organize the technology-leaning horizontals and the domain-leaning verticals so that they do not present an incoherent picture of the organization and a unified message is delivered to customers, and so that they do not compete with each other for customer business but instead compete healthily in delivering value to the customer.

86. How do you shift the focus of an organization steeped in the usual programming and development business, using technologies of previous decades, to one that can consider itself at the forefront of digital transformation with AI, machine learning and deep learning.

8. Finance

87. How do you invest in and maintain cloud infrastructure for new R&D projects when Finance keeps insisting on demonstrating the RoI.

88. How do you kick off projects on time and ensure that the schedule is not affected by incorrect or delayed Purchase Orders and invoices to and from the customer, when finance protocols prohibit starting projects without a clear Purchase Order.

9. Legal

89. How do you deal with newly laid-down government data privacy laws, and laws that vary from country to country.

90. How do you deal with HIPAA, GDPR and similar compliance issues.

91. How do you keep the total cost of a project to the customer low by choosing between open-source and enterprise tools.

92. How do you ensure that any untoward breach in privacy of data or process is handled swiftly and sternly causing little to no damage to the organization’s reputation.

93. How do you handle individual and class action lawsuits from departing employees in any country of business.

10. Home Govt

94. How do you run a business in a high tax environment and in a place with strict land acquisition and labour laws.

95. How do you speed up approvals for new investments.

96. How do you maintain continuity of work in the face of strikes and shutdowns in a city.

11. Job Market

97. How do you find candidates with the right skills and recruit genuine ones from among millions of resumes, several of which claim to have “set foot on the moon”.

98. How do you deal with candidates who have a long notice period and bail out at the last moment on your offer.

99. How do you deal with candidates who accept a job offer just to show it to a competitor in exchange for a better one.

100. How do you hire for a position when the upper limit of the gross salary on offer is lower than the median offered by large multinationals.

101. How do you prepare an accurate job description so that the right candidates apply for the position and the shortlisting process is easier.