HBase and Hive Data Warehouse Application Development
Intended readers: students in computer-related and artificial intelligence-related programs at higher vocational colleges
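This book combines commonly used big data storage tools with real-world case studies and follows a project-and-task-driven approach to give a fairly comprehensive introduction to the HBase distributed database and the Hive distributed data warehouse. It consists of nine projects: understanding databases and data warehouses; installing and configuring the HBase column-oriented database; building a blog database system with HBase Shell; developing the blog database system with the HBase Java API; installing and configuring the Hive structured data warehouse; performing data definition operations with Hive; analyzing and processing user coupon data with Hive Shell; developing a user coupon analysis application with the Hive Java API; and combining Hive and HBase storage technologies to analyze customer churn for a telecom operator.

Most projects include task practice sessions and end-of-project exercises. Working through them helps readers consolidate what they have learned and quickly master the HBase and Hive operations covered in the book. The book can serve as a textbook for big data technology programs at colleges and universities, and as a self-study guide for big data and database enthusiasts. Beyond improving their ability to apply big data storage technologies, readers are encouraged to develop the habit of independent learning, to strengthen their ability to identify, analyze, and solve problems, to think independently, and to cultivate a craftsman's spirit of dedication, refinement, and focus.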
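Tang Meixia (female, born March 1982) received her bachelor's degree in Computer Science and Technology from Hunan Normal University in 2003 and her master's degree in Computer Application Technology from Hunan University of Science and Technology in 2010. She is a member of the Communist Party of China, a teacher in the big data technology program, a Master of Engineering, an associate professor, a Nanning High-Level Talent (Category E), a big data instructor at the Huawei ICT Academy, a certified instructor for H3C big data platform operations and maintenance, and a "dual-qualified" teacher recognized by the Department of Education.

In 20 years of teaching she has taught C# Programming, Data Structures, MySQL Database, Linux Operating System, Data Mining and Machine Learning, HBase and Hive Data Warehouse Application Development, and Hadoop Development Fundamentals, among other courses. She has consistently pursued teaching reform and explored new teaching models and methods, with good results. She was chief editor of Java Programming and of SQL Server 2012 Database Principles and Applications, and a contributing author of Flash CS5 Practical Case Tutorial.

In December 2020 she won second prize in the National Teaching Ability Competition for Computer Majors in Higher Education (entry: Hadoop Development Fundamentals). From 2017 to 2021 she coached students in the "Big Data Technology and Application" event of the higher vocational group of the National Vocational College Skills Competition, winning first prize at the regional level and third prize nationally in 2017, and second prize at the regional level from 2018 to 2021. In 2019 her students won a national third prize in the China Software Cup National College Student Software Design Competition. In both 2019 and 2020 her students won first prize in the Guangxi College Student Artificial Intelligence Design Competition, where she was named Outstanding Instructor, and third prize in the Guangxi qualifier of the National Undergraduate Mathematical Contest in Modeling.

Guided by the principle of "research promoting teaching, and teaching driving research," she remains active in academic research, focusing on software technology, big data technology, and algorithm analysis. In recent years she has published more than 20 papers in journals including Computer Engineering and Design, Journal of Hunan University of Science and Technology (Natural Science Edition), Manufacturing Automation, and Guangxi Education, six of them in core journals. She has led three completed and two ongoing municipal- or department-level research projects, participated in two provincial- or ministerial-level projects and six completed municipal- or department-level projects, and holds one invention patent, one utility model patent, and 16 software copyrights.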
Table of Contents

Project 1 Understanding Databases and Data Warehouses
  [Teaching Objectives]
  [Background]
  Task 1 Understanding Big Data
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      1.1.1 The Concept and Development History of Big Data
      1.1.2 Data Types in Big Data
      1.1.3 Characteristics of Big Data
      1.1.4 Industry Applications of Big Data
      1.1.5 The Big Data Technology Stack
  Task 2 Understanding Big Data Storage Technologies
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      1.2.1 Introduction to Big Data Storage
      1.2.2 File System-Based Data Storage
      1.2.3 Database-Based Data Storage
      1.2.4 Data Warehouse-Based Data Storage
  Project Summary
  Exercises

Project 2 Installing and Deploying HBase
  [Teaching Objectives]
  [Background]
  Task 1 Building a Fully Distributed Hadoop Cluster
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      2.1.1 Introduction to Hadoop
      2.1.2 Core Components of Hadoop
      2.1.3 The Hadoop Ecosystem
      2.1.4 Preparation Before Building a Hadoop Cluster
    [Task Implementation]
    [Task Practice]
  Task 2 Installing a ZooKeeper Cluster
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      2.2.1 Introduction to ZooKeeper
      2.2.2 ZooKeeper Architecture
    [Task Implementation]
  Task 3 Installing and Configuring an HBase Cluster
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      2.3.1 Introduction to HBase
      2.3.2 Core Functional Modules of HBase
      2.3.3 The HBase Read/Write Flow
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 3 Building a Blog Database System with HBase Shell
  [Teaching Objectives]
  [Background]
  Task 1 Designing HBase Tables
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      3.1.1 The HBase Data Model
      3.1.2 Design Principles for HBase Table Structure
      3.1.3 HBase Retrieval Methods
      3.1.4 RowKey Design Principles
      3.1.5 The Hotspot Problem
      3.1.6 Column Family Design Principles
    [Task Implementation]
  Task 2 Creating HBase Tables
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      3.2.1 Namespaces
      3.2.2 Creating Tables
      3.2.3 Viewing Table Structure
      3.2.4 Modifying Tables
      3.2.5 Deleting Tables
    [Task Implementation]
    [Task Practice]
  Task 3 Querying HBase Table Data
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      3.3.1 Inserting Data
      3.3.2 Querying Data
      3.3.3 Scanning a Full Table
      3.3.4 Deleting Data
      3.3.5 Clearing Data
    [Task Implementation]
    [Task Practice]
  Task 4 Querying HBase Table Data That Meets Specified Conditions
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      3.4.1 Advanced HBase Queries
      3.4.2 HBase Abstract Operators
      3.4.3 HBase Comparators
      3.4.4 HBase Filters
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 4 Developing a Blog Database System with the HBase Java API
  [Teaching Objectives]
  [Background]
  Task 1 Setting Up the HBase Development Environment
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 2 Inserting and Querying Data
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      4.2.1 Main Interfaces and Classes of the HBase Java API
      4.2.2 Creating Namespaces and Tables with the HBase Java API
      4.2.3 Inserting Data with the HBase Java API
      4.2.4 Querying Data with the HBase Java API
      4.2.5 Full-Table Queries with the HBase Java API
    [Task Implementation]
    [Task Practice]
  Task 3 Querying Data That Meets Specified Conditions
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      4.3.1 The HBase Filter API
    [Task Implementation]
    [Task Practice]
  Task 4 Integrating MapReduce with HBase Tables
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      4.4.1 Running MapReduce Programs on a Hadoop Cluster
      4.4.2 Importing Data into HBase Tables
      4.4.3 Exporting Data from HBase Tables
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 5 Installing and Configuring the Hive Structured Data Warehouse
  [Teaching Objectives]
  [Background]
  Task 1 Installing and Configuring Hive
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      5.1.1 The Origin and Development of Hive
      5.1.2 Hive Compared with Traditional Databases
      5.1.3 Hive System Architecture
      5.1.4 How Hive Works
      5.1.5 Preparation Before Installation
    [Task Implementation]
    [Task Practice]
  Task 2 Running Shell Commands and dfs Commands in the Hive CLI
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      5.2.1 Running Hive Queries from a File
      5.2.2 Running Linux Shell Commands in Hive
      5.2.3 Using Hadoop dfs Commands in Hive
      5.2.4 Writing Comments in Hive Scripts
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 6 Defining Coupon Data with Hive
  [Teaching Objectives]
  [Background]
  Task 1 Creating Hive Tables
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      6.1.1 Hive Data Types
      6.1.2 Creating and Managing Data Warehouses
      6.1.3 Creating Tables
      6.1.4 Modifying Tables
    [Task Implementation]
    [Task Practice]
  Task 2 Importing Data into Hive Tables
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      6.2.1 Importing Data
      6.2.2 Exporting Data
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 7 Analyzing and Processing Coupon Consumption Data with Hive Shell
  [Teaching Objectives]
  [Background]
  Task 1 Querying Information on Users Who Claimed Coupons
    [Task Description]
    [Project Requirements]
    [Related Knowledge]
      7.1.1 Basic Queries with select
      7.1.2 Limiting Results with limit
      7.1.3 Deduplicated Queries with distinct
      7.1.4 Conditional Queries with where
      7.1.5 Hive Built-in Operators
      7.1.6 Regular Expressions
    [Task Implementation]
    [Task Practice]
  Task 2 Building the User Label Column
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      7.2.1 Using case ... when ... Statements
      7.2.2 Grouped Queries with group by
      7.2.3 Conditional Filtering with having
    [Task Implementation]
    [Task Practice]
  Task 3 Building User Feature Fields
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      7.3.1 Hive Built-in Functions
      7.3.2 Sorted Queries
    [Task Implementation]
    [Task Practice]
  Task 4 Joining User Feature Fields
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      7.4.1 Combining Result Sets with union
      7.4.2 Joining Table Data with join
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 8 Developing a Coupon Consumption Data Analysis Application with the Hive Java API
  [Teaching Objectives]
  [Background]
  Task 1 Setting Up the Hive Development Environment
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 2 Writing User-Defined Functions to Compute Coupon Discounts
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      8.2.1 Hive User-Defined Functions
      8.2.2 UDF Functions
      8.2.3 UDAF Functions
      8.2.4 UDTF Functions
    [Task Implementation]
    [Task Practice]
  Task 3 Building and Merging Feature Fields
    [Task Description]
    [Task Requirements]
    [Related Knowledge]
      8.3.1 Main Classes of the Hive Java API
      8.3.2 Methods for Executing SQL Statements
    [Task Implementation]
    [Task Practice]
  Project Summary
  Exercises

Project 9 Hands-On Analysis of Telecom Operator User Data with HBase and Hive
  [Teaching Objectives]
  [Background]
  Task 1 Case Background and Requirements Analysis
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 2 Data Preprocessing
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 3 Basic Queries on User Data
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 4 Analyzing User Call Behavior
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Task 5 Importing Hive Data into HBase
    [Task Description]
    [Task Requirements]
    [Task Implementation]
  Project Summary

Appendix: Common Ports of Big Data Components and Their Descriptions
References