Install and configure Apache Phoenix on Cloudera Hadoop CDH5



Apache Phoenix is a relational database layer over HBase delivered as a client-embedded JDBC driver targeting low-latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.

Step 1: Download the latest version of Phoenix with wget; the output is shown below


--2015-11-23 12:20:21-- http://mirror.reverse.net/pub/apache/phoenix/phoenix-4.3.1/bin/phoenix-4.3.1-bin.tar.gz
Resolving mirror.reverse.net... 208.100.14.200
Connecting to mirror.reverse.net|208.100.14.200|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72155049 (69M) [application/x-gzip]
Saving to: “phoenix-4.3.1-bin.tar.gz.1”
100%[=====================================] 72,155,049   614K/s   in 2m 15s
2015-04-10 12:25:45 (521 KB/s) - “phoenix-4.3.1-bin.tar.gz.1” saved [72155049/72155049]

Step 2: Extract the downloaded tar file to a convenient location

[root@maniadmin ~]# tar -zxvf phoenix-4.3.1-bin.tar.gz
phoenix-4.3.1-bin/bin/hadoop-metrics2-phoenix.properties
-
-
phoenix-4.3.1-bin/examples/WEB_STAT.sql
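
Steps 1 and 2 can be combined into one small script. This is only a sketch: the URL is the mirror that appears in the wget output above and may no longer be reachable, and the function name fetch_and_extract is made up for illustration. The -c flag asks wget to resume an existing partial download instead of saving a second copy with a ".1" suffix.

```shell
#!/usr/bin/env bash
# Sketch: download and unpack the Phoenix binary tarball.
# The mirror URL is taken from the wget log above; it is an assumption
# that it still serves this file.
PHOENIX_VERSION="4.3.1"
TARBALL="phoenix-${PHOENIX_VERSION}-bin.tar.gz"
URL="http://mirror.reverse.net/pub/apache/phoenix/phoenix-${PHOENIX_VERSION}/bin/${TARBALL}"

# Hypothetical helper: fetch the tarball, then extract it in the current directory.
fetch_and_extract() {
  # -c resumes a partial download rather than writing "<name>.1"
  wget -c "$URL" && tar -zxf "$TARBALL"
}

echo "Would download: $URL"
# Uncomment to actually download and extract:
# fetch_and_extract
```

The download call is left commented out so that the script can be inspected safely before it touches the network.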

Step 3: Copy phoenix-4.3.1-server.jar to the HBase lib directory on each region server and on the master server


On the master server, copy “phoenix-4.3.1-server.jar” to “/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/”.


On each HBase region server, copy “phoenix-4.3.1-server.jar” to “/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/”.


Step 4: Copy phoenix-4.3.1-client.jar to each HBase region server

Make sure phoenix-4.3.1-client.jar is present in /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/ on each region server.
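
On a cluster with more than a couple of nodes, the copies in steps 3 and 4 are easier to do in a loop. The following is a rough sketch, not a tested deployment script: the host names in HOSTS are placeholders, it assumes root SSH access to every node, and it assumes both jars sit at the top level of the extracted phoenix-4.3.1-bin directory. For simplicity it copies both jars to every listed host. By default it runs in dry-run mode and only prints the scp commands.

```shell
#!/usr/bin/env bash
# Sketch: copy the Phoenix jars into the HBase lib directory on every node.
# HOSTS is a hypothetical list; replace it with your master and region servers.
HOSTS="${HOSTS:-master1 region1 region2}"
# Assumption: the jars are at the top level of the extracted directory.
PHOENIX_DIR="${PHOENIX_DIR:-phoenix-4.3.1-bin}"
HBASE_LIB="/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

copy_phoenix_jars() {
  local host jar
  for host in $HOSTS; do
    for jar in phoenix-4.3.1-server.jar phoenix-4.3.1-client.jar; do
      if [ "$DRY_RUN" = "1" ]; then
        echo "scp $PHOENIX_DIR/$jar root@$host:$HBASE_LIB"
      else
        scp "$PHOENIX_DIR/$jar" "root@$host:$HBASE_LIB"
      fi
    done
  done
}

copy_phoenix_jars
```

Run it once as-is to review the printed commands, then rerun with DRY_RUN=0 to perform the copies.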


Step 5: Restart the HBase services via Cloudera Manager
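
Before triggering the restart, it may be worth confirming on each node that both jars are where HBase will look for them. A minimal sketch, assuming the same parcel path used in the previous steps; the function name check_phoenix_jars is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: verify that the Phoenix jars are present in the HBase lib directory.
HBASE_LIB="${HBASE_LIB:-/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib}"

# Hypothetical helper: prints one line per jar and returns non-zero if any is missing.
check_phoenix_jars() {
  local jar missing=0
  for jar in phoenix-4.3.1-server.jar phoenix-4.3.1-client.jar; do
    if [ -f "$HBASE_LIB/$jar" ]; then
      echo "found   $jar"
    else
      echo "MISSING $jar"
      missing=1
    fi
  done
  return "$missing"
}

check_phoenix_jars || echo "Copy the missing jars before restarting HBase."
```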

Step 6: Testing. Go to extracted_dir/bin and run the command below


[root@maniadmin bin]# ./psql.py localhost ../examples/WEB_STAT.sql ../examples/WEB_STAT.csv ../examples/WEB_STAT_QUERIES.sql 
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/11/23 13:51:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
no rows upserted
Time: 2.297 sec(s)
csv columns from database.
CSV Upsert complete. 39 rows upserted
Time: 0.554 sec(s)
DOMAIN                                                         AVERAGE_CPU_USAGE                         AVERAGE_DB_USAGE
---------------------------------------- ---------------------------------------- ----------------------------------------
Salesforce.com                                                           260.727                                 257.636
Google.com                                                              212.875                                   213.75
Apple.com                                                                 114.111                                 119.556
Time: 0.2 sec(s)
DAY                                             TOTAL_CPU_USAGE                           MIN_CPU_USAGE                           MAX_CPU_USAGE
----------------------- ---------------------------------------- ---------------------------------------- ----------------------------------------
2013-01-01 00:00:00.000                                       35                                       35                                       35
2013-01-02 00:00:00.000                                     150                                       25                                      125
2013-01-03 00:00:00.000                                       88                                       88                                       88
-
-
2013-01-04 00:00:00.000                                       26                                      3                                       23
2013-01-05 00:00:00.000                                     550                                       75                                     475
Time: 0.09 sec(s)
HO                   TOTAL_ACTIVE_VISITORS
-- ----------------------------------------
EU                                     150
NA                                       1
Time: 0.052 sec(s)
Done.
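
The same psql.py script can run your own SQL files, not just the bundled examples. The sketch below writes a tiny script and runs it if psql.py is found in the current directory. The table name SMOKE_TEST and the file path are invented for this illustration, and "localhost" is the same ZooKeeper host passed in the command above.

```shell
#!/usr/bin/env bash
# Sketch: write a small SQL script and, if psql.py is available, run it.
# SMOKE_TEST is a hypothetical table name used only for this illustration.
SQL_FILE="${SQL_FILE:-/tmp/phoenix_smoke_test.sql}"

cat > "$SQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS SMOKE_TEST (ID BIGINT NOT NULL PRIMARY KEY, NAME VARCHAR);
UPSERT INTO SMOKE_TEST VALUES (1, 'hello');
UPSERT INTO SMOKE_TEST VALUES (2, 'phoenix');
SELECT COUNT(*) FROM SMOKE_TEST;
EOF

# Run this from extracted_dir/bin, where psql.py lives.
if [ -x ./psql.py ]; then
  ./psql.py localhost "$SQL_FILE"
else
  echo "psql.py not found in $(pwd); wrote $SQL_FILE only"
fi
```

Phoenix uses UPSERT rather than INSERT, so rerunning the script overwrites the same two rows instead of failing on a duplicate key.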








Step 7: Open the SQL shell


[root@maniadmin bin]# ./sqlline.py localhost
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/11/23 14:58:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 4.3)
Driver: PhoenixEmbeddedDriver (version 4.3)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
77/77 (100%) Done
Done
sqlline version 1.1.8
0: jdbc:phoenix:localhost>



Amazon in-house interview questions

Once you clear the two or three telephone rounds, Amazon will invite you for in-house interviews (face-to-face or video conference) at the nearest Amazon office.

You will meet with 5-6 Amazonians. The mix of interviewers will include managers and peers that make up the technical team.

Each meeting is a one-on-one interview session lasting approximately 45-60 minutes (about 5 hours in total).

In my case it was a video conference round. Below are a few of the behavioral questions covered across the 5 rounds.


1) Why Amazon?
2) What is your understanding of the role?
3) What do you wish to change in your current environment?
4) What is the customer interaction that you are most proud of?
5) Describe your most difficult customer interaction.
6) Tell me about a time you made a significant mistake. What would you have done differently?
7) Give an example of a tough or critical piece of feedback you received. What was it and what did you do about it?
8) Describe a time when you needed the cooperation of a peer or peers who were resistant to what you were trying to do. What did you do? What was the outcome?
9) Describe a time you saw a peer struggling. What did you do to help?
10) Give me an example of when you had to make an important decision in the absence of good data because there just wasn’t any. What was the situation and how did you arrive at your decision? Did the decision turn out to be the correct one? Why or why not?
11) Tell me about a time you took a big risk. What was the situation?
12) Give me an example of a time when you were able to deliver an important project under a tight deadline. What sacrifices did you have to make to meet the deadline? How did they impact the final deliverables?