This is the documentation for Cloudera Enterprise 5.8.x. Documentation for other versions is available at Cloudera Documentation.

Configuring Short-Circuit Reads

So-called "short-circuit" reads bypass the DataNode, allowing a client to read the file directly, as long as the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications and help improve HBase random read profile and Impala performance.

Short-circuit reads require libhadoop.so (the Hadoop Native Library) to be accessible to both the server and the client. libhadoop.so is not available if you have installed from a tarball. You must install from an .rpm, .deb, or parcel to use short-circuit local reads.

  Note: If you are configuring short-circuit reads on a CDH cluster that uses the DSSD Storage Appliance, see DSSD D5 and Short-Circuit Reads.

Configuring Short-Circuit Reads Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  Note: Short-circuit reads are enabled by default in Cloudera Manager.
  1. Go to the HDFS service.
  2. Click the Configuration tab.
  3. Select Scope > Gateway or HDFS (Service-Wide).
  4. Select Category > Performance.
  5. Locate the Enable HDFS Short Circuit Read property or search for it by typing its name in the Search box. Check the box to enable it.

    If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.

  6. Click Save Changes to commit the changes.

Configuring Short-Circuit Reads Using the Command Line

  Important:
  • If you use Cloudera Manager, do not use these command-line instructions.
  • This information applies specifically to CDH 5.8.x. If you use a lower version of CDH, see the documentation for that version located at Cloudera Documentation.
Configure the following properties in hdfs-site.xml to enable short-circuit reads in a cluster that is not managed by Cloudera Manager:
<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>

<property>
    <name>dfs.client.read.shortcircuit.streams.cache.size</name>
    <value>1000</value>
</property>


<property>
    <name>dfs.client.read.shortcircuit.streams.cache.expiry.ms</name>
    <value>10000</value>
</property>

<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
  Note: The text _PORT appears just as shown; you do not need to substitute a number.

If /var/run/hadoop-hdfs/ is group-writable, make sure its group is root.

Page generated July 8, 2016.