Saturday, June 29, 2013

Daily Bookmarks 20130629

Lessons Learned from Using HBase for Facebook Message Storage | Binospace
http://www.binospace.com/index.php/hbase-zai-facebook-message-cun-chu-di-shi-yong-jing-yan-zong-jie/
[HBase] KeyValue and HFile create - 吊丝码农 - ITeye Tech Website
http://iwinit.iteye.com/blog/1827527
Th30z (Matteo Bertozzi Code): HBase I/O: HFile
http://th30z.blogspot.tw/2011/02/hbase-io-hfile.html
Obtaining Data Distribution by Parsing the HFile Index Structure_Hadoop and Distributed Data Processing_ITPUB Forum - it168's Professional Technical Community
http://www.itpub.net/thread-1625291-1-1.html

Using HFile outside HBase at HUGUK #7 | Lanyrd
http://lanyrd.com/2010/huguk7/sxbw/


Methods for Fast URL Deduplication
http://www.360doc.com/content/08/1031/15/3500_1855560.shtml
An Introduction to and Comparison of Open-Source Web Crawlers - Bill's Blog
http://ibillxia.github.io/blog/2010/08/20/several-open-source-web-crawlers-comparing/
Web Crawler Design: The Bloom Filter URL Deduplication Algorithm Explained 02_cphmvp
http://cphmvp.diandian.com/post/2013-01-17/40046782422
A URL Deduplication System and Method for Distributed Web Crawlers - IP.com
http://ip.com/patfam/zh/47647145

Static Cache: Log Co-occurrence Word Analysis « Search Technology Blog - Taobao
http://www.searchtb.com/2013/06/%e9%9d%99%e6%80%81cache%e4%b9%8blog%e5%85%b1%e7%8e%b0%e8%af%8d%e5%88%86%e6%9e%90.html?spm=0.0.0.0.efcrfI
From Di Renjie's Character Divination to eTao's Query Analysis: The Finale « Search Technology Blog - Taobao
http://www.searchtb.com/2011/01/from-augur-to-etao-query-analysis.html?spm=0.0.0.0.iMCbQH
From Di Renjie's Character Divination to eTao's Query Analysis « Search Technology Blog - Taobao
http://www.searchtb.com/2010/11/%e4%bb%8e%e7%8b%84%e4%bb%81%e6%9d%b0%e7%9a%84%e6%b5%8b%e5%ad%97%e5%8d%a0%e5%8d%9c%e5%88%b0%e4%b8%80%e6%b7%98%e7%bd%91%e7%9a%84query%e5%88%86%e6%9e%90.html?spm=0.0.0.0.iMCbQH

Friday, June 28, 2013

Daily Bookmarks 20130628

Building Web Apps in WebView | Android Developers
http://developer.android.com/guide/webapps/webview.html
Android Programming: A Simple Browser, Web View (WebView).
http://androidbiancheng.blogspot.tw/2010/01/webview.html
Weakapp's Memo: How to Use Android's WebView
http://weakapp0320.blogspot.tw/2013/04/android-webview-1.html
[Android] Passing Values from WebView to HTML - No 1105 - 點部落
http://www.dotblogs.com.tw/joe11051105/archive/2013/04/14/101573.aspx
Using WebView in Android Development (with Complete Program) | App Development Notes
http://www.pocketdigi.com/20110216/176.html

Canned Platypus : Availability and Partition Tolerance
http://pl.atyp.us/wordpress/?p=2521
On Correctly Understanding the CAP Theorem
http://www.douban.com/group/topic/11765014/
About CAP - 一个故事@MySQL DBA
http://www.orczhou.com/index.php/2010/05/all-about-cap-i-learn/
Brewer's CAP Theorem
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
Consistency | Xexex's Java and a Few Other Things
http://www.javaworld.com.tw/roller/ingramchen/entry/consistency
Why Can't Partition Tolerance Be Sacrificed? Plus a Plain-Language Explanation of Partition._Sina Qing Blog
http://qing.blog.sina.com.cn/tj/709d1dde33000bb7.html
Availability and Partition Tolerance - 搞计算机的's Journal - NetEase Blog
http://hhw3.blog.163.com/blog/static/2690966201191442724418/
How to “Beat” the CAP Theorem
http://www.programmer.com.cn/9260/

[Repost] A Primer on TF-IDF for Keyword Extraction - 码农.KEN - 博客园
http://www.cnblogs.com/ken-zhang/archive/2010/06/20/1761108.html
[Share] Preventing Duplicate Django Form Submissions with a Decorator - 码农.KEN - 博客园
http://www.cnblogs.com/ken-zhang/archive/2010/12/25/1916437.html

Friday, June 21, 2013

Taiwan Hadoop Forum • View Topic - About the Hadoop Client
http://forum.hadoop.tw/viewtopic.php?t=18
Securing an Apache Hadoop Cluster Through a Gateway | Apache Hadoop for the Enterprise | Cloudera
http://blog.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/

First Experience with Solr @ 不大會寫程式 :: 隨意窩 Xuite Blog
http://blog.xuite.net/misgarlic/weblogic/30448629-solr+%E5%88%9D%E9%AB%94%E9%A9%97
Full Text_A First Look at Solr, the Full-Text Search Server
http://newsletter.ascc.sinica.edu.tw/news/read_news.php?nid=2288


Hadoop SecondNamenode Explained - qhw - ChinaUnix Blog
http://blog.chinaunix.net/uid-20577907-id-3524135.html

Thursday, June 20, 2013

Daily Bookmarks 20130620

Esse, of Something: n-gram, Language, and Other Symbols
http://esse_tsyo.blogspot.tw/2010/10/n-gram.html
Google n-gram
http://googleresearch.blogspot.tw/2006/08/all-our-n-gram-are-belong-to-you.html

2.4. Example Configurations
http://hbase.apache.org/book/example_config.html
HBase Beginner's Notes (4) -- Installing and Configuring a Fully Distributed HBase Cluster - 林场 - 博客园
http://www.cnblogs.com/ventlam/archive/2011/01/22/HBaseCluster.html
HBase Import and Export - _Deron_ - 博客园
http://www.cnblogs.com/Deron/archive/2013/03/31/2981934.html

Things to Note When Writing MapReduce Jobs That Run on HBase - Distributed Applications and Server Architecture Column - Blog Channel - CSDN.NET
http://blog.csdn.net/chenyi8888/article/details/8646659

监控网 - A Website Providing Website Monitoring, Remote Server Monitoring Systems, and Monitoring for SNMP, nginx, MySQL, and Mail Services
http://www.jiankong.cn/

Taobao Core Systems Team Blog | Beanstalkd: A High-Performance Distributed In-Memory Queue System
http://rdc.taobao.com/blog/cs/?p=1201

fxsjy/miniseg
https://github.com/fxsjy/miniseg
鹰之瞳---Automated Network Operations and Maintenance System---
https://www.yingzhitong.com/accounts/login/?next=/state/

dfs.datanode.failed.volumes.tolerated - Google Search
https://www.google.com.tw/search?q=dfs.datanode.failed.volumes.tolerated&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a
Hadoop Parameter Settings – hdfs-site.xml « Fenriswolf Programming Notes
http://fenriswolf.me/2012/05/25/hadoop-%E5%8F%83%E6%95%B8%E8%A8%AD%E5%AE%9A-hdfs-site-xml/
The Meaning of Hadoop Configuration Settings (Still Being Updated) - xiao晓 - 博客园
http://www.cnblogs.com/serendipity/archive/2011/08/23/2151031.html


Thursday, June 06, 2013

Daily Bookmarks 20130606

Terminal Recording with script and scriptreplay command
http://sharadchhetri.com/2012/07/16/terminal-recording-script-scriptreplay-command/
Virtual Vocaloid Manager: How to Record a Log of Linux Terminal Operations
http://vocaloidmanager.blogspot.tw/2013/01/linux.html
How to use script and scriptreplay | OracleOnLinux
http://www.oracleonlinux.cn/2010/04/how-to-use-script-and-scriptreplay/
chunzi-blog-simple/chunzi-blog-posts/1171441019.html at master · chunzi/chunzi-blog-simple · GitHub
https://github.com/chunzi/chunzi-blog-simple/blob/master/chunzi-blog-posts/1171441019.html

How to Traverse a Directory Tree in Python - Guide to os.walk | Python Central
http://pythoncentral.org/how-to-traverse-a-directory-tree-in-python-guide-to-os-walk/
Python program to traverse directories and read file information - Stack Overflow
http://stackoverflow.com/questions/5421599/python-program-to-traverse-directories-and-read-file-information
filesystems - Directory listing in Python - Stack Overflow
http://stackoverflow.com/questions/120656/directory-listing-in-python

time - Timestamp Python - Stack Overflow
http://stackoverflow.com/questions/13890935/timestamp-python

Big Data? Stop Bluffing! Do We Really Need to Blindly Burn Money Chasing Big Data? - CSDN.NET
http://www.csdn.net/article/2013-05-14/2815268-most-data-isnt-big

Daily Bookmarks 20130530

CloudFront: Salesforce.com's Phoenix : SQL layer for your Hbase
http://cloudfront.blogspot.tw/2013/01/salesforcecoms-phoenix-sql-layer-for.html

Sorting in Hadoop Hive: Order By, Sort By, Distribute By, Cluster By - ITeye Tech Website
http://metooxi.iteye.com/blog/1447621
alo.alt: Using Hive's HBase handler
http://mapredit.blogspot.tw/2012/12/using-hives-hbase-handler.html

HBase Shell Basics and Common Commands Explained | 三江小渡
http://blog.pureisle.net/archives/1887.html


Daily Bookmarks 20130605

Writing shell scripts - Lesson 15: Errors and Signals and Traps (Oh My!) - Part 1
http://linuxcommand.org/wss0150.php


Tuesday, June 04, 2013

Daily Bookmarks 20130604

fcamel's Quick Tech Notes: Handling Filenames Containing Whitespace in Shell Scripts
http://fcamel-life.blogspot.tw/2011/08/shell-script.html
LinuxCommand.org: Tips, News And Rants: Using Configuration Files With Shell Scripts
http://lcorg.blogspot.tw/2010/06/using-configuration-files-with-shell.html


hadoop - Hive multiple insert goes wrong with the DISTINCT select statement - Stack Overflow
http://stackoverflow.com/questions/15173608/hive-multiple-insert-goes-wrong-with-the-distinct-select-statement

Ankit Jain's blog: Sqoop export and import commands
http://ankitasblogger.blogspot.tw/2012/01/sqoop-export-and-import-commands.html
Welcome to Kitsune’s documentation! — Kitsune master documentation
http://kitsune.readthedocs.org/en/latest/index.html#

HBase - Who needs a Master? : Apache HBase
https://blogs.apache.org/hbase/entry/hbase_who_needs_a_master
Hadoop HBase user's mailing list
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/34592
database - How Row Key is designed in Hbase - Stack Overflow
http://stackoverflow.com/questions/16356491/how-row-key-is-designed-in-hbase
HBaseWD: Avoid RegionServer Hotspotting Despite Sequential Keys | Sematext Blog
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Introduction to HBase - Alibaba Group Data Platform alidata.org
http://www.alidata.org/archives/1509

Apache HBase Region Splitting and Merging | Hortonworks
http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Best Practices for Managing HBase in a High Write Environment | The AppFirst Blog
http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/
split - in HBase what will happen if a single row size exceeds region max size? - Stack Overflow
http://stackoverflow.com/questions/15828310/in-hbase-what-will-happen-if-a-single-row-size-exceeds-region-max-size
Some HBase Tips - Change Dir - BlogJava good
http://www.blogjava.net/changedi/archive/2012/12/28/393577.html
Updating Data in HBase - 天行健 - ITeye Tech Website
http://punishzhou.iteye.com/blog/1266341
The HBase get Process (1) - 天行健 - ITeye Tech Website
http://punishzhou.iteye.com/blog/1258848



Lucene in 5 minutes - Lucene Tutorial.com
http://www.lucenetutorial.com/lucene-in-5-minutes.html
Salmon Run: Writing Lucene Records to SequenceFiles on HDFS
http://sujitpal.blogspot.tw/2012/03/writing-lucene-records-to-sequencefiles.html
hadoop - opening lucene index stored in hdfs - Stack Overflow
http://stackoverflow.com/questions/2763112/opening-lucene-index-stored-in-hdfs
Indexing and Searching on a Hadoop Distributed File System | Dr Dobb's
http://www.drdobbs.com/parallel/indexing-and-searching-on-a-hadoop-distr/226300241?pgno=1