Use SSH Tunneling to access Azure HDInsight Hive Server 2 ODBC/JDBC endpoint

Why use an SSH tunnel?

If you are researching the topic of using an SSH tunnel to access Azure HDInsight, you would have stumbled upon this article on why and how to set it up.

Here is another reason why you would want to SSH tunnel in HDInsight. Let’s say that you want to establish an ODBC/JDBC connection via SSH Tunnel to the Hive Server 2 endpoint on your HDInsight cluster. My reason for doing this is very specific to an issue – 502 Bad Gateway error returned from HDInsight cluster when I execute a Hive query. If you have the same issue, please continue reading.

What is the root cause of 502 Bad Gateway Error returned from HDInsight?

A 502 Bad Gateway error can be returned from the HDInsight internal central gateway. This is expected behaviour: the central gateway has a connection timeout of 2 minutes, after which it disconnects any connection it considers inactive. When the Hive client initiates a request, say via a Hive ODBC/JDBC driver, the central gateway acts as a reverse proxy and routes the request to the Hive server component in the HDInsight cluster. Once the Hive server starts processing the request, a response (not the query results) needs to be sent back to the client. If that does not happen within the 2-minute timeout, a 502 Bad Gateway error is returned to the client.

Given the intermittent nature of the issue I faced, it is likely that for some queries the Hive server takes more than 2 minutes to respond. Data-refresh operations can take some time depending on a number of factors, including data volume and the level of optimisation using partitions. Running a data refresh through the Hive ODBC driver can therefore result in long-running, often unreliable HTTPS connections.

The 502 Bad Gateway error can be mitigated by bypassing the HDInsight internal central gateway. This is possible by establishing an SSH tunnel between the host which runs the client (via the Hive ODBC driver) and the HDInsight head node. HDInsight uses the Hortonworks Data Platform (HDP) Hadoop distribution. There is an article from Hortonworks which explains how to establish an ODBC/JDBC connection via an SSH tunnel.

The Hive query will continue without a 502 Bad Gateway error when it bypasses the HDInsight central gateway via the SSH tunnel. For query optimisation, a reference is available here: https://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/

How to create an SSH tunnel

Find out the transport mode of your HDInsight Hive Server 2

Go to Ambari -> Hive -> Configs -> Advanced

Look for hive.server2.transport.mode. If the value is http, Hive Server 2 is running on port 10001 on the head node. If the transport mode is binary, it is running on port 10000 on the head node. This is the Hive Server 2 port used for both ODBC and JDBC connections.

Please make sure that hive.server2.transport.mode = http, because the Hive ODBC/JDBC driver uses HTTP to connect to Hive Server 2.
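The mapping between transport mode and port described above can be written down as a small lookup. This is only an illustrative sketch; the function and dictionary names are hypothetical and are not part of Hive or HDInsight.

```python
# Sketch only: maps hive.server2.transport.mode to the head-node port
# described in the text. The names used here are hypothetical.
HIVE_SERVER2_PORTS = {"http": 10001, "binary": 10000}


def hive_server2_port(transport_mode: str) -> int:
    """Return the Hive Server 2 port for the given transport mode."""
    mode = transport_mode.strip().lower()
    if mode not in HIVE_SERVER2_PORTS:
        raise ValueError(f"unknown transport mode: {transport_mode!r}")
    return HIVE_SERVER2_PORTS[mode]


print(hive_server2_port("http"))    # 10001
print(hive_server2_port("binary"))  # 10000
```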

Create an SSH tunnel from the Hive ODBC client host to the HDInsight head node as follows

If you are running Windows Server 2019 or Windows 10 1809 on the host with the Hive ODBC driver, please follow this article to install OpenSSH. Otherwise please install Git Bash.

ssh -L 10001:[insert head node FQDN or IP address here]:10001 sshuser@[insert head node FQDN or IP address here]
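If you want to script the tunnel rather than type it by hand, the same command can be assembled programmatically. The sketch below assumes the ssh client is on the PATH; the helper name and the example host name headnode.example.internal are hypothetical placeholders, so substitute your own head node FQDN or IP address.

```python
import shlex


def build_tunnel_command(head_node, ssh_user="sshuser", port=10001):
    """Return the ssh argument list that forwards local `port` to the head node.

    `head_node` is the head node FQDN or IP address. The function name is
    hypothetical; only `ssh -L` itself comes from the article.
    """
    if not head_node:
        raise ValueError("head_node must be a non-empty FQDN or IP address")
    forward = f"{port}:{head_node}:{port}"
    return ["ssh", "-L", forward, f"{ssh_user}@{head_node}"]


# Placeholder host name, used for illustration only.
cmd = build_tunnel_command("headnode.example.internal")
print(shlex.join(cmd))
```

Passing the returned list to subprocess.run(cmd) would open the tunnel interactively, in the same way as running the command in a terminal.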

In the ODBC Data Sources (64-bit) tool, configure a user DSN as follows:

In order to disable SSL, you must select “User Name and Password” as the authentication mechanism.

You must disable SSL because you are connecting to a private/internal endpoint of your HDInsight head node

Click Test

You have just bypassed the HDInsight internal gateway using SSH tunneling.
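If the DSN test fails, it can help to first confirm that the local end of the tunnel is accepting connections at all. The following is a minimal sketch, assuming the tunnel forwards local port 10001 as in the command above; the function name is hypothetical, and it only checks that a TCP connection can be opened, not that Hive Server 2 answers correctly.

```python
import socket


def tunnel_is_open(host="127.0.0.1", port=10001, timeout=3.0):
    """Return True if a TCP connection to the local tunnel port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    print("tunnel reachable:", tunnel_is_open())
```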

Building Docker image with Azure IoT Edge cross-compiled libraries for Raspberry Pi

TL;DR – In my last article, I wrote about the steps it takes to build a Docker image with cross-compiled native libraries for arm-hf/arm32/Raspbian/RaspberryPi and .NET Core 2.0.0 DLLs compiled for linux-arm. However, that approach takes too many manual steps when running your own container. A better practice is to build from a Dockerfile, which you can download.

How to get started?

  1. Run this on your dev machine, not your target Raspberry Pi. This is because the Docker image is based upon Debian 8 (jessie) x86-64 GNU/Linux. This is the environment needed to run the RPiToolchain as well as the .NET Core 2.0.0 SDK.
  2. Git clone my repo from here.
  3. Build the Docker image with the following command:
docker build -t iot-edge-rpi .

As a result of the cross-compilation and .NET Core 2.0.0 compilation, you get a tarball which you can copy onto your target Raspberry Pi. The cleanest way to do so is by mounting a host directory onto your container before running it. Then copy iot-edge-rpi.tar.gz to /mnt.

docker run -v /home/username:/mnt --name iot-edge-rpi -it iot-edge-rpi

Copy iot-edge-rpi.tar.gz to your target Raspberry Pi. The easiest way is to use SCP.

Radio silence

When I started this blog, I wanted to write regularly with at least a blog post per month, but October has been a really quiet month from me. The reason for the radio silence is that I moved between suburbs in Sydney. I also moved between continents, between Malaysia and Australia. I have also re-joined Microsoft as a Technology Solutions Professional for IoT in the Asia Incubation team. Fun times ahead and the passion for anything IoT related continues.

A recent announcement that got me stoked was the Microsoft Band. It’s more than just a fitness tracker because, as this blog post rightly puts it, it’s the power of IoT on your wrist. I can’t wait to wear one on my wrist.

This device would be a great addition to my Project GetFitY’all. When I began this project it was really about the internet of my things, and the more devices connected to it, the merrier. Over time I intend to connect more devices or sensor gadgets. I also need to replace the MEMS sensor board because the temperature reading stopped working. I think I might have “killed” the function. All it returns now is the following:

MPL3115:        Alt. 0.0        Temp: 0.0
MPL3115:        Alt. 0.0        Temp: 0.0
MPL3115:        Alt. 0.0        Temp: 0.0
MPL3115:        Alt. 0.0        Temp: 0.0

No worries, I probably will buy a few other boards just as backup. Meanwhile it’s back to work. Let’s rock on!

RaspberryWiFai getting mobile

Raspberryfai has transformed into RaspberryWiFai, and it can’t wait to go out there under the spring sun! I’d procrastinated for a while over opening up the clear case and plugging in the Xtrinsic-Sense board that I purchased with it. Initially I wanted to buy a downgrade GPIO cable from 40 pins to 26 pins so that the cable could slip nicely out of the clear case. I could order one from Adafruit, but that would be an international shipment and the shipping would cost me more than the inexpensive cable itself. I checked out a local electronics mart, but they didn’t sell it. I searched more online stores that deliver fast around here, and none sell this cable.

Now that RaspberryWiFai is more mobile thanks to a working WiFi USB adapter, why not power it up using my 19,200 mAh power bank, open up the clear case top, plug in the Xtrinsic-Sense board, and take it out for a drive tomorrow? Here’s what RaspberryWiFai looks like with the dropped top.

raspberrywifai

However, the funny thing was that I didn’t know which was GPIO pin 1. The Xtrinsic-Sense board has 26 pins to work with the older RaspPi A and B, but the RaspPi B+ has 40 pins. Pin 1 wasn’t labelled on the RaspPi. I looked through GPIO layout diagrams for the RaspPi B+ but couldn’t find anything until I came across a comment from Matt, the author of this blog, who said “if you look on the reverse of the PCB Pin 1 has a square pad and the others have round pads.” Got it!

Without further ado, I executed the Python script for getting the temperature/pressure. The sun was up, and still is, so it was a good time to show how the temperature changes when I move RaspberryWiFai from inside my unit to the balcony, which is baking hot from the spring sun. Here’s the output showing the temperature rising (I’ve condensed the output for brevity’s sake here):

pi@raspberryfai ~/rpi_sensor_board $ sudo python mpl3115a2.py
MPL3115: Alt. -59.888 Temp: 23.176
MPL3115: Alt. -60.04 Temp: 23.192
MPL3115: Alt. -60.776 Temp: 23.176
MPL3115: Alt. -59.888 Temp: 23.16
MPL3115: Alt. -57.872 Temp: 23.192
MPL3115: Alt. -58.824 Temp: 23.208
MPL3115: Alt. -58.0 Temp: 23.192
MPL3115: Alt. -54.2 Temp: 23.16
MPL3115: Alt. -54.68 Temp: 23.144
MPL3115: Alt. -57.856 Temp: 24.16
MPL3115: Alt. -58.808 Temp: 24.64
MPL3115: Alt. -58.824 Temp: 24.8
MPL3115: Alt. -57.84 Temp: 24.96
MPL3115: Alt. -58.84 Temp: 24.128
MPL3115: Alt. -57.84 Temp: 24.16
MPL3115: Alt. -58.792 Temp: 24.176
MPL3115: Alt. -57.808 Temp: 25.0
MPL3115: Alt. -57.84 Temp: 25.32
MPL3115: Alt. -57.2 Temp: 25.48
MPL3115: Alt. -57.2 Temp: 25.64
MPL3115: Alt. -58.888 Temp: 25.8
MPL3115: Alt. -58.04 Temp: 25.96
MPL3115: Alt. -58.856 Temp: 25.112
MPL3115: Alt. -57.84 Temp: 25.128
MPL3115: Alt. -57.84 Temp: 25.144
MPL3115: Alt. -57.2 Temp: 25.16
MPL3115: Alt. -54.856 Temp: 25.208
MPL3115: Alt. -54.84 Temp: 25.192
MPL3115: Alt. -54.52 Temp: 25.224
MPL3115: Alt. -54.68 Temp: 25.24
MPL3115: Alt. -55.0 Temp: 25.0
MPL3115: Alt. -55.856 Temp: 26.16
MPL3115: Alt. -55.856 Temp: 26.32
MPL3115: Alt. -56.76 Temp: 26.48
MPL3115: Alt. -55.36 Temp: 26.64
MPL3115: Alt. -55.04 Temp: 26.8
MPL3115: Alt. -55.2 Temp: 26.96
MPL3115: Alt. -55.856 Temp: 26.112
MPL3115: Alt. -55.2 Temp: 26.128
MPL3115: Alt. -55.36 Temp: 26.144
MPL3115: Alt. -56.808 Temp: 26.16
MPL3115: Alt. -59.872 Temp: 26.112
MPL3115: Alt. -60.76 Temp: 26.8
MPL3115: Alt. -60.76 Temp: 26.64

Indeed, RaspberryWiFai was basking in the sun: from 23.176 degrees C at the start, it went up to 26.96 degrees C under the hot sun. Smoking hot! Time to wire it up to my ISS, and do some cool things like raise an alarm when it is too hot.
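As a first step towards such an alarm, the script output can be parsed and checked against a threshold. This is a rough sketch that assumes the line format shown above; the function names and the 26.0 degrees C threshold are my own placeholders, not part of mpl3115a2.py.

```python
import re

# Matches lines such as "MPL3115: Alt. -59.888 Temp: 23.176".
READING = re.compile(
    r"MPL3115:\s+Alt\.\s+(-?\d+(?:\.\d+)?)\s+Temp:\s+(-?\d+(?:\.\d+)?)"
)


def parse_temperatures(lines):
    """Extract the temperature from each sensor line, skipping other lines."""
    temps = []
    for line in lines:
        match = READING.search(line)
        if match:
            temps.append(float(match.group(2)))
    return temps


def too_hot(temps, threshold_c=26.0):
    """Return the readings above the (hypothetical) alarm threshold."""
    return [t for t in temps if t > threshold_c]


# Three lines taken from the output above.
sample = [
    "MPL3115: Alt. -59.888 Temp: 23.176",
    "MPL3115: Alt. -57.856 Temp: 24.16",
    "MPL3115: Alt. -55.2 Temp: 26.96",
]
temps = parse_temperatures(sample)
print(min(temps), max(temps))  # 23.176 26.96
print(too_hot(temps))          # [26.96]
```

In practice the lines would come from the running script rather than a hard-coded list, and the alarm action would replace the final print.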