DevOps: Git Interview Questions and Answers

1. What is Git?

Answer: Git is a distributed version control system and source code management (SCM) system with an emphasis on handling both small and large projects with speed and efficiency.

2. What is a repository in Git?

Answer: A repository contains a directory named .git, where Git keeps all of its metadata for the repository. The contents of the .git directory are private to Git.

3. How can we know if a branch is already merged into master in Git?

Answer: We can use the following commands for this purpose:
  • git branch --merged master: This lists the branches merged into master
  • git branch --merged: This lists the branches merged into HEAD (i.e. the tip of the current branch)
  • git branch --no-merged: This lists the branches that have not been merged
By default, this applies only to local branches. We can use the -a flag to show both local and remote branches, or the -r flag to show only the remote branches.
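
The commands above can be tried in a throwaway repository. This is a minimal sketch: the branch names feature-done and feature-wip, the placeholder identity and the temporary directory are assumptions made for illustration only.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"

# feature-done is merged into master; feature-wip is not.
git checkout -q -b feature-done
git commit -q --allow-empty -m "finished feature"
git checkout -q master
git merge -q --no-ff -m "merge feature-done" feature-done

git checkout -q -b feature-wip
git commit -q --allow-empty -m "unfinished feature"
git checkout -q master

merged=$(git branch --merged master)
unmerged=$(git branch --no-merged master)
echo "merged into master:"; echo "$merged"
echo "not merged:";         echo "$unmerged"
```

The first list should contain feature-done (and master itself), while the second should contain only feature-wip.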

4. What is the purpose of git stash drop?

Answer: If we no longer need a particular stash, we use the git stash drop command to remove it from the list of stashes.
By default, this command removes the most recently added stash.
To remove a specific stash, we pass it as an argument to the git stash drop command.
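
A short sketch of both forms, run in a throwaway repository. The file name notes.txt, the stash messages and the placeholder identity are assumptions for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

echo "first idea" >> notes.txt
git stash push -q -m "first idea"          # will become stash@{1}
echo "second idea" >> notes.txt
git stash push -q -m "second idea"         # newest entry, stash@{0}

git stash drop -q 'stash@{1}'              # drop one specific stash
remaining=$(git stash list)
echo "$remaining"

git stash drop -q                          # no argument: drops the newest stash
count=$(git stash list | wc -l | tr -d ' ')
echo "stashes left: $count"
```

After the first drop only the "second idea" entry should remain, and after the second drop the stash list should be empty.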

5. What is HEAD in Git?

Answer: HEAD is a reference to the currently checked-out commit.
Normally it is a symbolic reference to the branch that we have checked out.
At any given time, one head is selected as the ‘current head’; this head is also known as HEAD (always in uppercase).
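
A minimal sketch for inspecting HEAD in a throwaway repository; the placeholder identity is an assumption. It also shows the detached case, where HEAD holds a commit ID directly instead of a branch name.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"

branch=$(git symbolic-ref --short HEAD)    # the branch HEAD points to
commit=$(git rev-parse HEAD)               # the commit HEAD resolves to
echo "HEAD -> $branch ($commit)"
cat .git/HEAD                              # the symbolic reference as stored on disk

git checkout -q --detach                   # detach HEAD at the current commit
detached=$(cat .git/HEAD)                  # now a plain commit ID
echo "detached HEAD: $detached"
```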

6. What is the most popular branching strategy in Git?

Answer: There are many ways to do branching in Git. One of the popular ways is to maintain two branches:
  • master: This branch is used for production. In this branch, HEAD is always in a production-ready state.
  • develop: This branch is used for development. In this branch, we store the latest code developed in the project. This is work-in-progress code.

Once the code is ready for deployment to production, it is merged into the master branch from the develop branch.
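
The two-branch flow can be sketched as follows. The file name feature.txt, the commit messages and the placeholder identity are assumptions; real projects usually add review and tagging steps around the merge.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial release"

# Day-to-day work happens on develop.
git checkout -q -b develop
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

# When the code is ready for production, merge develop into master.
git checkout -q master
git merge -q --no-ff -m "release: merge develop" develop
released=$(git branch --merged master)
echo "$released"
```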

7. What is SubGit?

Answer: SubGit is a software tool used for migrating from SVN to Git. It is easy to use. With it we can create a writable Git mirror of a Subversion repository.
It creates a bi-directional mirror that can be used for pushing to Git as well as committing to Subversion.
SubGit also takes care of synchronization between Git and Subversion.

8. What is the use of git instaweb?

Answer: git instaweb is a script with which we can browse a Git repository in a web browser. It sets up gitweb and a web server that makes the working repository browsable over HTTP.

9. What are Git hooks?

Answer: Git hooks are scripts that can run automatically when an event occurs in a Git repository. They are used to automate the workflow in Git.
Git hooks also help in customizing the internal behavior of Git.
They are mostly used for enforcing a Git commit policy.
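
As an illustration of a commit policy, here is a sketch of a commit-msg hook. The rule it enforces (messages must start with a ticket ID such as PROJ-123) is a hypothetical policy, and the placeholder identity is an assumption.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file holding the proposed commit message.
if ! grep -qE '^[A-Z]+-[0-9]+: ' "$1"; then
  echo "commit message must start with a ticket ID, e.g. PROJ-123: ..." >&2
  exit 1
fi
EOF
chmod +x .git/hooks/commit-msg             # hooks must be executable to run

# A message without a ticket ID is rejected by the hook.
if git commit -q --allow-empty -m "fix typo" 2>/dev/null; then
  rejected=no
else
  rejected=yes
fi

# A message that follows the policy is accepted.
git commit -q --allow-empty -m "PROJ-123: fix typo"
accepted=$(git log --oneline | wc -l | tr -d ' ')
echo "rejected first attempt: $rejected, commits recorded: $accepted"
```

Hooks in .git/hooks are local to one clone and are not transferred by git clone, so a team-wide policy needs some other way of distributing the script.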

10. What are the main benefits of Git?

Answer: The main benefits of Git are:
  • Distributed System: Git is a Distributed Version Control System (DVCS). So you can keep your private work in version control but completely hidden from others. You can work offline as well.
  • Flexible Workflow: Git allows you to create your own workflow. You can use the process that is suitable for your project. You can go for a centralized, master-slave or any other workflow.
  • Fast: Git is fast compared to other version control systems.
  • Data Integrity: Since Git checksums content with SHA-1, data cannot be corrupted without Git detecting it.
  • Free: Git is free and open source, so many beginners use it for their first projects. It also works well with large projects.
  • Collaboration: Git is easy to use for projects in which collaboration is required. Many popular open source projects across the globe use Git.
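
The data integrity point can be seen with git hash-object, which prints the object ID Git would assign to some content. This is a small sketch; the sample strings are arbitrary.

```shell
set -e
# The same content always gets the same object ID ...
a=$(printf 'hello\n' | git hash-object --stdin)
b=$(printf 'hello\n' | git hash-object --stdin)
# ... and changing a single character gives a completely different one.
c=$(printf 'hellp\n' | git hash-object --stdin)
echo "$a"
echo "$b"
echo "$c"
```

Because every object is stored under the hash of its content, a modified or damaged object no longer matches its ID, which is what git fsck checks for.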

11. What are the disadvantages of Git?

Answer: Git has very few disadvantages. These are the scenarios in which Git is difficult to use. Some of these are:
  • Binary Files: If we have a lot of binary (non-text) files in our project, then Git becomes slow, e.g. projects with a lot of images or Word documents.
  • Steep Learning Curve: It takes some time for a newcomer to learn Git. Some of the Git commands are non-intuitive to a beginner.
  • Slow remote speed: Sometimes the use of remote repositories is slow due to network latency. Still, Git is faster than many other version control systems.

12. How will you start Git for your project?

Answer: We use the git init command in an existing project directory to start version control for our project. After this, we can use the git add and git commit commands to add files to our Git repository.
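
A minimal sketch of these three commands. The project directory is a temporary one, and the file app.py and the placeholder identity are assumptions for illustration.

```shell
set -e
project=$(mktemp -d)                       # stands in for an existing project directory
cd "$project"
echo "print('hello')" > app.py             # hypothetical existing file

git init -q                                # creates the .git directory
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git add app.py                             # stage the file
git commit -q -m "initial commit"          # record it in the repository

commits=$(git rev-list --count HEAD)
echo "commits so far: $commits"
```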

13. What is git clone in Git?

Answer: In Git, we use the git clone command to make a copy of an existing Git repository on our local machine.
This is the most popular way for developers to create a copy of a repository.
It is similar to svn checkout, but in this case the working copy is a full-fledged repository.
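
To illustrate that a clone is a complete repository, this sketch clones a local repository and counts the history in the copy. The directory names original and copy and the placeholder identity are assumptions; in practice the source is usually a remote URL.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A small source repository with two commits.
git init -q original
cd original
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
cd ..

git clone -q original copy                 # the clone carries the full history
cd copy
history=$(git rev-list --count HEAD)
echo "commits available in the clone: $history"
```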

14. How will you create a repository in Git?

Answer: To create a new repository in Git, we first create a directory for the project. Then we run the ‘git init’ command. Git now creates the .git directory inside our project directory. This is how our new Git repository is created.

15. What are the different ways to start work in Git?

Answer: We can start work in Git in the following ways:
  • New Project: To create a new repository we use the git init command.
  • Existing Project: To work on an existing repository we use the git clone command.

16. In which language is Git written?

Answer: Most of Git is written in C, along with Bourne shell. Some of the commands are written in Perl.

17. What does the ‘git pull’ command do internally?

Answer: In Git, git pull internally does a git fetch first and then does a git merge.
So pull is a combination of two commands: fetch and merge.
We use the git pull command to bring our local branch up to date with its remote version.
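
The equivalence can be checked with two clones of the same repository: one updated with fetch followed by merge, the other with pull. This is a sketch; the directory names upstream, local-a and local-b and the placeholder identity are assumptions, and the merge here happens to be a fast-forward.

```shell
set -e
work=$(mktemp -d)
cd "$work"

git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master    # name the first branch master regardless of local defaults
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
cd ..

git clone -q upstream local-a
git clone -q upstream local-b

# upstream moves ahead by one commit
cd upstream
echo two >> file.txt
git commit -q -am "second"
cd ..

# local-a: the two steps done by hand
cd local-a
git fetch -q origin
git merge -q --ff-only origin/master
a=$(git rev-parse HEAD)
cd ..

# local-b: the same result in one step
cd local-b
git pull -q --ff-only origin master
b=$(git rev-parse HEAD)
cd ..

echo "fetch+merge: $a"
echo "pull:        $b"
```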

18. What is git stash?

Answer: In Git, sometimes we do not want to commit our code, but we also do not want to lose the unfinished work. In this case we use the git stash command to record the current state of the working directory and index in a stash. This stores the unfinished work in a stash and cleans the current branch of uncommitted changes.
Now we can work on a clean working directory.
Later we can take the stash and apply those changes back to our working directory.
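
A minimal sketch of stashing and restoring unfinished work. The file name app.txt and the placeholder identity are assumptions for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "stable" > app.txt
git add app.txt
git commit -q -m "initial commit"

echo "half-finished change" >> app.txt     # uncommitted work
git stash push -q                          # save it and clean the working directory
clean=$(git status --porcelain)            # empty when nothing is modified

git stash pop -q                           # bring the unfinished work back
restored=$(git status --porcelain)         # app.txt shows up as modified again
echo "after pop: $restored"
```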

19. What is the meaning of ‘stage’ in Git?

Answer: In Git, staging is the step before commit. To stage a file means to mark it as ready for commit.
Let us say you are working on two features in Git. One of the features is finished and the other is not yet ready. You want to commit and leave for home in the evening. But you cannot commit everything, since the two features are not both ready. In this case, you can stage just the feature that is ready and commit that part. The second feature will remain as work in progress.
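
The two-feature scenario can be sketched like this. The file names feature_a.txt and feature_b.txt and the placeholder identity are assumptions for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo base > feature_a.txt
echo base > feature_b.txt
git add feature_a.txt feature_b.txt
git commit -q -m "initial commit"

echo "finished" >> feature_a.txt           # feature A is ready
echo "work in progress" >> feature_b.txt   # feature B is not

git add feature_a.txt                      # stage only the finished feature
git commit -q -m "finish feature A"

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
pending=$(git status --porcelain)
echo "committed: $committed"
echo "still pending: $pending"
```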

20. What is the purpose of the git config command?

Answer: We can set the configuration options for our Git installation by using the git config command.
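
A short sketch of setting and reading options. It writes only repository-level settings in a throwaway repository; the name and email are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q

# Without a scope flag, values are written to this repository's .git/config.
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

name=$(git config --get user.name)         # read a single value
echo "user.name = $name"
git config --local --list                  # list everything set for this repository
```

The --global flag writes to the current user's configuration instead, and --system to the machine-wide one.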

AWS vs Azure vs GCP



Error CM Server guid updated | CDH 5.14

Error Message:

[19/Jun/2018 02:05:43 +0000] 4343 MainThread agent        INFO     CM server guid: 2994b67a-d726-4b5f-a389-6e17214dcf00
[19/Jun/2018 02:05:43 +0000] 4343 MainThread agent        INFO     Using parcels directory from server provided value: /opt/cloudera/parcels
[19/Jun/2018 02:05:43 +0000] 4343 MainThread parcel       INFO     Agent does create users/groups and apply file permissions
[19/Jun/2018 02:05:43 +0000] 4343 MainThread parcel_cache INFO     Using /opt/cloudera/parcel-cache for parcel cache
[19/Jun/2018 02:05:43 +0000] 4343 MainThread agent        ERROR    Error, CM server guid updated, expected 2994b67a-d726-4b5f-a389-6e17214dcf00, received 88ac9a7c-0b7a-48af-a8d0-93e67b496056

Solution:

Delete /var/lib/cloudera-scm-agent/cm_guid on each node, then restart the Cloudera agent.
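
The fix can be sketched as a small shell function. The function name reset_cm_guid is hypothetical, and the restart command in the comment assumes the agent is managed through the usual service wrapper; the demo runs against a scratch file so that nothing on the host is touched.

```shell
set -e
# Remove the stale GUID file so the agent can register with the new CM server GUID.
reset_cm_guid() {
  guid_file=${1:-/var/lib/cloudera-scm-agent/cm_guid}
  if [ -f "$guid_file" ]; then
    rm -f "$guid_file"
    echo "removed $guid_file"
  else
    echo "no GUID file at $guid_file"
  fi
}

# On a real node (as root) one would run:
#   reset_cm_guid
#   service cloudera-scm-agent restart
# Demonstration on a scratch file:
scratch=$(mktemp)
echo "2994b67a-d726-4b5f-a389-6e17214dcf00" > "$scratch"
result=$(reset_cm_guid "$scratch")
echo "$result"
```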

Migrating File Based Sentry Policies to Sentry Server

  1. Install the Sentry service.
  2. Remove the configuration for Hive or Impala to use Sentry policy files:
     In Cloudera Manager for Hive:
       1. Navigate to Hive > Configuration > Service-Wide > Policy File Based Sentry > Enable Sentry Authorization using Policy Files
       2. Uncheck the box.
     In Cloudera Manager for Impala:
       1. Navigate to Impala > Configuration > Service-Wide > Policy File Based Sentry > Enable Sentry Authorization using Policy Files
       2. Uncheck the box.
  3. Enable the Sentry Service:
     In Cloudera Manager for Hive:
       1. Navigate to Hive > Configuration > Service-Wide > Sentry Service
       2. Click the radio button for the Sentry Service.
     In Cloudera Manager for Impala:
       1. Navigate to Impala > Configuration > Service-Wide > Sentry Service
       2. Click the radio button for the Sentry Service.
  4. Stop the Sentry Service.
  5. Back up the Sentry database. The following steps will write data into the Sentry database.
  6. Import the settings by running the following commands on the node where HiveServer2 is running:
    1. Set HIVE_HOME so that the Sentry commands work. This should contain bin/hive (typically /usr/lib/hive, or under /opt/cloudera/parcels for parcel installs):
        export HIVE_HOME=/usr/lib/hive
       or, for a parcel-based install:
        export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
    2. Validate the existing Sentry provider .ini file to make sure it does not have any errors, using this example syntax:
        sentry --hive-config /etc/hive/conf --command config-tool -s file:///etc/sentry/conf/sentry-site.xml -i hdfs://nameservice1/user/hive/sentry/sentry-provider.ini -v
       Note: You may get an error like this:
        Sentry server: HS2
        Found configuration problems
        ERROR: Error processing file hdfs://nameservice1/user/hive/sentry/sentry-provider.ini
        Server name server1 in server=server1 is invalid. Expected HS2
        ERROR: Failed to process global policy file hdfs://nameservice1/user/hive/sentry/sentry-provider.ini
       This means that Sentry expects its server name to be HS2 by default, so you need to set its server name to server1 (as specified in the sentry-provider.ini file).
       To do that, set the server name in Sentry Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml and restart the service.
       Ensure that the same value is reflected in /etc/sentry/conf/sentry-site.xml on the host where Sentry is installed. If it does not take effect, copy the sentry-site.xml from the Cloudera Manager process directory into a new sentry-site.xml in your home directory and pass that file with -s, as in the import command in step 4.
    3. Set HIVE_CONF_DIR. This contains hive-site and sentry-site for Hive. For Cloudera Manager deployed systems it is set as follows:
        export HIVE_CONF_DIR="/var/run/cloudera-scm-agent/process/`ls -alrt /var/run/cloudera-scm-agent/process | grep HIVESERVER2 | tail -1 | awk '{print $9}'`"
    4. Run the Sentry config-tool:
        sentry --hive-config /etc/hive/conf --command config-tool --import --policyIni hdfs://nameservice1/user/hive/sentry/sentry-provider.ini -s file:///home/subbav/sentry-site.xml
       Important: The policy file should be a fully qualified URI, for example hdfs://namenode:8020/user/hive/sentry/sentry-provider.ini or file:///local/data/sentry/sentry-provider.ini. The general form is:
        sentry --command config-tool --import -i <Policy_file_URI>
  7. Start the Sentry Service.
  8. Run commands in Beeline to test if privileges are set correctly.

Dr Elephant Installation on Linux - Cloudera - Part 4

Dr Elephant Installation 

· Clone Dr. Elephant:

[ drelephant]$ pwd

mkdir dr-ele       

Now, inside /dr-ele/app/com/linkedin/drelephant/analysis/, change the following Resource Manager values from http to https:

yarn.resourcemanager.webapp.http.address to yarn.resourcemanager.webapp.https.address

[mani@node1.manilab analysis]$ cat | grep https
  private static final String RESOURCE_MANAGER_ADDRESS = "yarn.resourcemanager.webapp.https.address";
  private static final String RM_NODE_STATE_URL = "https://%s/ws/v1/cluster/info";
    URL succeededAppsURL = new URL(new URL("https://" + _resourceManagerAddress), String.format(
    URL failedAppsURL = new URL(new URL("https://" + _resourceManagerAddress), String.format(

· Compile Dr. Elephant

[ dr-ele]$ pwd

[ dr-ele]$ ./ ./compile.conf

· A ‘dist’ folder will now be created. Extract the zip file inside /dist.

[ dist]$ pwd

[ dist]$ ll
total 88148
drwxr-xr-x 8 mani manidevl      131 Jan 18 07:42 dr-elephant-2.0.13
-rw-r--r-- 1 mani manidevl 90259919 Jan 18 07:33
Extract it.

From here on, everything happens inside the extracted folder.

· In /dr-ele/dist/dr-elephant-2.0.13/app-conf/elephant.conf, change the following values:


# Database configuration


jvm_args="-Devolutionplugin=disabled -DapplyEvolutions.default=false -mem 1024 -J-Xloggc:$project_root../logs/elephant/dr-gc.`date +'%Y%m%d%H%M'` -J-XX:+PrintGCDetails"
Give 777 permission to the HDFS directory /user/history/done, and verify it with hadoop fs -ls /user/history/done.

· Start Dr. Elephant

./ ../app-conf/
Make sure Dr. Elephant started without any errors; check dr.log.

· Go to the Dr. Elephant UI

Change the hostname/IP according to your environment; you should be able to see the Dr. Elephant dashboard.

In our case it is:

· Run a sample Hadoop job

After the job completes, you can see its analysis in the Dr. Elephant UI.