This article walks through configuring LZO compression on hadoop-2.6.2.
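The cluster has three hosts: bi10, bi12, and bi13. All of the steps below are run on bi10.
Building lzo requires a few dependency packages; skip this step if they are already installed. Switch to the root user first.
yum install gcc gcc-c++ kernel-devel
yum install git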
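Besides these two installs, you also need Maven. Download it, unpack it, and set the environment variables; no further installation is needed.
wget http://apache.fayea.com/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -xzf apache-maven-3.3.9-bin.tar.gz
Set the Maven environment variables. Here the unpacked Maven directory is placed at /home/hadoop/work/apache-maven-3.3.9. Run source ~/.bash_profile (or log in again) afterwards so the change takes effect.
[hadoop@bi10 hadoop-2.6.2]$ vim ~/.bash_profile
#init maven environment
export MAVEN_HOME=/home/hadoop/work/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin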
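Before moving on, it can help to confirm that the build tools are actually on PATH. The following is a minimal sketch, not part of the original procedure; the function name missing_tools is hypothetical, and the tool list in the comment is an assumption based on the packages installed above.

```shell
# Print the name of every tool given as an argument that is not on PATH.
# Prints nothing when all of them are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call for the tools this walkthrough relies on (assumed list):
#   missing_tools gcc g++ make git mvn
```

If the call prints mvn, the most likely cause is that ~/.bash_profile has not been re-read in the current shell.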
Download the lzo source package
[hadoop@bi10 apps]$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.09.tar.gz
Unpack lzo, then build and install it to /usr/local/hadoop/lzo/. Switch to the root user for the installation.
[hadoop@bi10 apps]$ tar -xzf lzo-2.09.tar.gz
[hadoop@bi10 apps]$ cd lzo-2.09
[hadoop@bi10 lzo-2.09]$ su root
[root@bi10 lzo-2.09]$ ./configure --enable-shared --prefix=/usr/local/hadoop/lzo/
[root@bi10 lzo-2.09]$ make && make test && make install
Check the installation directory
[hadoop@bi10 lzo-2.09]$ ls /usr/local/hadoop/lzo/
include lib share
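Listing the prefix only shows that the three directories exist. A slightly stricter check could look for the header and the shared library that the later hadoop-lzo build needs. This is a sketch under assumptions: the function name check_lzo_prefix is hypothetical, and it assumes the headers are installed under include/lzo/ and the shared library is named liblzo2.so*.

```shell
# Return 0 if PREFIX looks like a complete shared-library LZO install.
# Assumption: headers live in include/lzo/ and the library is liblzo2.so*.
check_lzo_prefix() {
    prefix=$1
    if [ ! -f "$prefix/include/lzo/lzo1x.h" ]; then
        echo "missing header: $prefix/include/lzo/lzo1x.h"
        return 1
    fi
    # An unmatched glob stays literal, so test each candidate for existence.
    for lib in "$prefix"/lib/liblzo2.so*; do
        [ -e "$lib" ] && return 0
    done
    echo "missing shared library: $prefix/lib/liblzo2.so*"
    return 1
}

# Example call for the prefix used in this article:
#   check_lzo_prefix /usr/local/hadoop/lzo
```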
Download hadoop-lzo
git clone https://github.com/twitter/hadoop-lzo.git
Set the environment variables and build with Maven
[hadoop@bi10 hadoop-lzo]$ export CFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export CXXFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
[hadoop@bi10 hadoop-lzo]$ export LIBRARY_PATH=/usr/local/hadoop/lzo/lib
[hadoop@bi10 hadoop-lzo]$ mvn clean package -Dmaven.test.skip=true
Copy the build output into the Hadoop installation directory. The native libraries go into lib/native on bi10, and the jar goes into share/hadoop/common on all three hosts.
[hadoop@bi10 hadoop-lzo]$ tar -cBf - -C target/native/Linux-amd64-64/lib . | tar -xBvf - -C $HADOOP_HOME/lib/native/
[hadoop@bi10 hadoop-lzo]$ cp target/hadoop-lzo-0.4.20-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi12:$HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi13:$HADOOP_HOME/share/hadoop/common/
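The jar name above is tied to the version that was checked out at the time, so a later clone may produce a different file name. One possible way to avoid hard-coding it is sketched below. The function name find_lzo_jar is hypothetical, and the -sources and -javadoc suffixes are an assumption about extra jars that the Maven build may also produce.

```shell
# Print the main hadoop-lzo jar under BUILD_DIR/target, skipping any
# -sources or -javadoc jars. Returns 1 if no jar is found.
find_lzo_jar() {
    for jar in "$1"/target/hadoop-lzo-*.jar; do
        case "$jar" in
            *-sources.jar|*-javadoc.jar) continue ;;
        esac
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    return 1
}

# Example, run from the hadoop-lzo checkout:
#   cp "$(find_lzo_jar .)" "$HADOOP_HOME/share/hadoop/common/"
```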
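Copy the build output to the matching directories on the other cluster machines. The native directory has to be packed into an archive first, copied over, and then unpacked on each machine.
# -C keeps the archived paths relative (native/...), so the archive unpacks cleanly under lib/
tar -czf hadoop-native.tar.gz -C $HADOOP_HOME/lib native
scp hadoop-native.tar.gz bi12:$HADOOP_HOME/lib
scp hadoop-native.tar.gz bi13:$HADOOP_HOME/lib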
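The article does not show the unpacking step on the remote machines. The sketch below packs the native directory with relative paths and, in a comment, shows one way the archive might be unpacked remotely. The function name pack_native is hypothetical, and the ssh line assumes passwordless ssh and the same $HADOOP_HOME path on every host.

```shell
# Archive HADOOP_HOME/lib/native with paths relative to lib/, so the
# tarball unpacks to native/... wherever it is extracted.
pack_native() {
    hadoop_home=$1
    out=$2
    tar -czf "$out" -C "$hadoop_home/lib" native
}

# Example on bi10:
#   pack_native "$HADOOP_HOME" hadoop-native.tar.gz
# After copying the archive to a remote host, it could be unpacked with:
#   ssh bi12 "tar -xzf $HADOOP_HOME/lib/hadoop-native.tar.gz -C $HADOOP_HOME/lib"
```

Archiving an absolute path instead would store the full directory chain in the tarball, which then recreates that chain under whatever directory it is extracted in.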
Edit hadoop-env.sh and add the following
# The lzo library
export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib
Edit core-site.xml
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
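A typo in the long codec list is easy to miss. As a sanity check, the effective value can be read back and tested for a specific codec. This is a sketch; the function name has_codec is hypothetical, and the example call assumes the hdfs command is on PATH and picks up the edited core-site.xml.

```shell
# Return 0 if CODEC appears as a whole entry in the comma-separated LIST.
# Whitespace inside LIST is ignored.
has_codec() {
    list=$(printf '%s' "$1" | tr -d '[:space:]')
    codec=$2
    case ",$list," in
        *",$codec,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_codec "$(hdfs getconf -confKey io.compression.codecs)" \
#       com.hadoop.compression.lzo.LzopCodec && echo "LzopCodec is registered"
```

Matching on whole entries matters here because LzoCodec and LzopCodec differ by a single letter.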
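Edit mapred-site.xml. The first two properties use the older mapred.* names, which Hadoop 2.x still accepts as deprecated aliases of mapreduce.map.output.compress and mapreduce.map.output.compress.codec.
<!-- lzo compression -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>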
Copy the three configuration files to the other cluster machines
scp etc/hadoop/hadoop-env.sh bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/hadoop-env.sh bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
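The six scp commands follow one pattern, so on a larger cluster a loop may be easier to maintain. The sketch below prints the commands by default and only runs them when DRY_RUN=0 is set. The function name sync_configs and the DRY_RUN variable are hypothetical, and it assumes the function is called from the Hadoop installation directory, as the commands above are.

```shell
# Print (default) or run one scp per host and configuration file.
# Usage: sync_configs REMOTE_CONF_DIR HOST...
sync_configs() {
    conf_dir=$1
    shift
    for host in "$@"; do
        for f in hadoop-env.sh core-site.xml mapred-site.xml; do
            if [ "${DRY_RUN:-1}" = 1 ]; then
                # Dry run: show the command without touching the network.
                echo "scp etc/hadoop/$f $host:$conf_dir/"
            else
                scp "etc/hadoop/$f" "$host:$conf_dir/"
            fi
        done
    done
}

# Example:
#   sync_configs /home/hadoop/work/hadoop-2.6.2/etc/hadoop bi12 bi13
#   DRY_RUN=0 sync_configs /home/hadoop/work/hadoop-2.6.2/etc/hadoop bi12 bi13
```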
Install lzop; this requires switching to the root user
yum install lzop
Go to the Hadoop installation directory and compress LICENSE.txt with lzop. This produces the lzo-compressed file LICENSE.txt.lzo.
lzop LICENSE.txt
Upload the compressed file to HDFS
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -put LICENSE.txt.lzo /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput
Found 1 items
-rw-r--r-- 2 hadoop supergroup 7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo
Build an index for the lzo-compressed file. The indexer writes a .index file next to each .lzo file.
hadoop jar ./share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.DistributedLzoIndexer /user/hadoop/wordcount/lzoinput/
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput/
Found 2 items
-rw-r--r-- 2 hadoop supergroup 7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo
-rw-r--r-- 2 hadoop supergroup 8 2016-02-16 21:02 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo.index
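With one file, the listing is enough to confirm that the index was created. For a directory with many files, a small filter over the listing could report any .lzo file that still lacks an index. This is a sketch; the function name lzo_without_index is hypothetical, and it assumes paths contain no spaces.

```shell
# Read one path per line on stdin; print each *.lzo path that has no
# matching *.lzo.index path in the same input. Other lines are ignored.
lzo_without_index() {
    awk '
        # Strip the 6-character ".index" suffix to recover the .lzo path.
        /\.lzo\.index$/ { indexed[substr($0, 1, length($0) - 6)] = 1; next }
        /\.lzo$/        { lzo[++n] = $0 }
        END {
            for (i = 1; i <= n; i++)
                if (!(lzo[i] in indexed)) print lzo[i]
        }
    '
}

# Example (the last field of each hdfs dfs -ls line is the path):
#   hdfs dfs -ls /user/hadoop/wordcount/lzoinput/ | awk '{print $NF}' | lzo_without_index
```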
Run wordcount on the lzo-compressed file
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /user/hadoop/wordcount/lzoinput/ /user/hadoop/wordcount/output2