Wednesday, January 30, 2013

Daily Bookmarks 20130130


LanguageManual Transform - Apache Hive - Apache Software Foundation
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform
Log Parsing through Hadoop, Hive & Python | Search, Data and Technology
http://www.hiregion.com/2010/02/log-parsing-through-hadoop-hive-python.html



Scheduler | Qubole
http://www.qubole.com/loaders?q=scheduler

Beginning Java - Unit 5 Methods - What is a method?
http://mathbits.com/MathBits/Java/methods/Lesson1.htm

Emacs Getting-Started Guide (1): Introduction to Emacs - gmszone's Column - Blog Channel - CSDN.NET
http://blog.csdn.net/gmszone/article/details/7724635


High Scalability - High Scalability - FastBit: An Efficient Compressed Bitmap Index Technology
http://highscalability.com/blog/2009/5/1/fastbit-an-efficient-compressed-bitmap-index-technology.html

Notes on Hive (2): Hive Architecture - Alibaba Group Data Platform alidata.org
http://www.alidata.org/archives/499

Notes on Hive (3): Similarities and Differences Between Hive and Databases - Alibaba Group Data Platform alidata.org
http://www.alidata.org/archives/551

Summary of HBase Secondary Index Approaches_klose_Sina Blog
http://blog.sina.com.cn/s/blog_4a1f59bf01018apd.html





Java EE 6 Platform Highlights - The Java EE 6 Tutorial
http://docs.oracle.com/javaee/6/tutorial/doc/giqvh.html

Initializing Fields (The Java™ Tutorials > Learning the Java Language > Classes and Objects)
http://docs.oracle.com/javase/tutorial/java/javaOO/initial.html





Tuesday, January 29, 2013

Daily Bookmarks 20130128


Summary of the Past Week's Work: On Bitmap Indexes - wingsless - cnblogs
http://www.cnblogs.com/wingsless/archive/2012/10/25/2740070.html

Multi-Dimensional Category Leaderboards: Using Bitmap Indexes -- MySQL -- IT Tech Blog Study -- Learn Together, Improve Together!
http://blogread.cn/it/article/494?f=wb
Bitmap Index_Baidu Baike
http://baike.baidu.com/view/2923346.htm

HBase Secondary Indexes and Join -- Algorithms -- IT Tech Blog Study -- Learn Together, Improve Together!
http://blogread.cn/it/article/3673?f=sa


Analysis of the Taobao Data Cube Technical Architecture - Alibaba Group Data Platform alidata.org
http://www.alidata.org/archives/1789
Finding similar items using minhashing | Hacker News
http://news.ycombinator.com/item?id=2218447

9.7. Regions
http://hbase.apache.org/book/regions.arch.html#compaction
6.2.  On the number of column families
http://hbase.apache.org/book/number.of.cfs.html
How Many Column Families Should You Choose in HBase? - Mac Track - Blog Channel - CSDN.NET
http://blog.csdn.net/macyang/article/details/6420286





Saturday, January 26, 2013

Daily Bookmarks 20130125

When You've Been Single for a Long Time | 张衡Henry
http://www.izhangheng.com/only-me/

How to Build the Underlying System Architecture for a High-Traffic Website? | 张衡Henry
http://www.izhangheng.com/high-volume-sites-system-architecture/
Lessons from JD.com: Systems Commonly Used in Team Development | 张衡Henry
http://www.izhangheng.com/system-commonly-team-development/

A recommendation webservice in 10 minutes | “I for one welcome our new computer overlords”
http://ssc.io/a-recommendation-webservice-in-10-minutes/
plista/kornakapi · GitHub
https://github.com/plista/kornakapi
Deploying a massively scalable recommender system with Apache Mahout | “I for one welcome our new computer overlords”
http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/
Flexible Collaborative Filtering In JAVA With Mahout Taste | My Blog by Philippe Adjiman
http://www.philippeadjiman.com/blog/2009/11/11/flexible-collaborative-filtering-in-java-with-mahout-taste/
Mechanistician: Large-Scale Machine Learning with Mahout - Part 1
http://mechanistician.blogspot.tw/2010/07/large-scale-machine-learning-with.html
Mahout Development Environment with Maven and Eclipse (2) | Shuyo's Weblog
http://shuyo.wordpress.com/2011/02/14/mahout-development-environment-with-maven-and-eclipse-2/


Twitter sentiment analysis using Python and NLTK | Laurent Luce's Blog
http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/
Basic Sentiment Analysis with Python | fjavieralba.com
http://fjavieralba.com/basic-sentiment-analysis-with-python.html








Thursday, January 24, 2013

Tuesday, January 22, 2013

Daily Bookmarks 20130119

R and Hadoop: Step-by-step tutorials
http://blog.revolutionanalytics.com/2012/03/r-and-hadoop-step-by-step-tutorials.html
Understanding the World using Tables and Graphs « Aurelius
http://thinkaurelius.com/2012/03/22/understanding-the-world-using-tables-and-graphs/
Big data Big Analytics
http://www.slideshare.net/ajayohri/big-data-big-analytics

R Brings Revolutionary Change to Statistical Analysis of Hadoop Cluster Data - 51CTO.COM
http://database.51cto.com/art/201109/294837.htm

Daily Bookmarks 20130121

Wired: MapR, a Model of Hadoop Commercialization - Virtualization & Cloud Computing - 51 Open Source Community | Focused on Open Source, Linux and Android - Powered by Discuz!
http://bbs.51osos.com/thread-7088-1-1.html
Wired: MapR, a Model of Hadoop Commercialization-CSDN.NET
http://www.csdn.net/article/2011-12-26/309694
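MapR Technologies Claims Better Performance_Cloud Computing Frontier Technology-Zhongguancun Online (ZOL)
http://cloud.zol.com.cn/272/2721534.html
"It took the components the company needed from the open-source Apache project while discarding the ones it did not like (notably the Hadoop Distributed File System, HDFS, which MapR considers a single point of failure and replaced with a Unix-based network file system). This competitor to Cloudera and Hortonworks combines its M5 commercial Hadoop distribution with services such as support, training, and consulting (the M3 distribution is free and 100 percent compatible with Apache Hadoop). MapR has partnered with EMC, which uses M5 as the basis of its EMC Greenplum HD Enterprise Edition."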
Platform for Big Data | Talend
http://www.talend.com/products/platform-for-big-data
Talend and MapR Jointly Announce Big Data Integration and Quality Certification - Open Source China OSChina.NET
http://www.oschina.net/news/27032/talend-mapr-big-data





Daily Bookmarks 20130122

On Graph Computing « Marko A. Rodriguez
http://markorodriguez.com/2013/01/09/on-graph-computing/

Friday, January 18, 2013

Daily Bookmarks 20130118


Autoboxing and Unboxing (The Java™ Tutorials > Learning the Java Language > Numbers and Strings)
http://docs.oracle.com/javase/tutorial/java/data/autoboxing.html

Generic Methods (The Java™ Tutorials > Learning the Java Language > Generics (Updated))
http://docs.oracle.com/javase/tutorial/java/generics/methods.html
Lesson: Interfaces (The Java™ Tutorials > Collections)
http://docs.oracle.com/javase/tutorial/collections/interfaces/index.html



Generic Types (The Java™ Tutorials > Learning the Java Language > Generics (Updated))
http://docs.oracle.com/javase/tutorial/java/generics/types.html



Thursday, January 17, 2013

Tuesday, January 15, 2013

Daily Bookmarks 20130115


git-svn(1)
http://www.kernel.org/pub/software/scm/git/docs/git-svn.html
Integrating git and svn with git-svn
http://kanru.info/blog/archives/466/

Software Developer Productivity | All about your Mind | Microsoft Stack - Jay On Software - Daily Routine of a 4 Hour Programmer
http://jayonsoftware.com/home/2012/1/9/daily-routine-of-a-4-hour-programmer.html
My life in Accenture before startups - swombat.com on startups
http://swombat.com/2011/6/7/accenture-before-startups
My life in Accenture before startups | Hacker News
http://news.ycombinator.com/item?id=2636486

Git for Neat Freaks: pull --rebase and merge --no-ff
http://hungyuhei.github.com/2012/08/07/better-git-commit-graph-using-pull---rebase-and-merge---no-ff/
git-pull-rebase
http://happycasts.net/episodes/10

How to implement RESTful authentication - Synopse
http://blog.synopse.info/post/2011/05/24/How-to-implement-RESTful-authentication
















Friday, January 11, 2013

Daily Bookmarks 20130111

一号门 - A Programmer's Work, a Programmer's Life (java, python, delphi in practice)
http://www.yihaomen.com/default.asp?cateID=48
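Integrating django with Baidu ueditor, Part 2: Doodles, Screenshots, Video Search, Image Browsing - 一号门 - A Programmer's Work, a Programmer's Life (java, python, delphi in practice)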
http://www.yihaomen.com/article/python/239.htm


New to Data Science
http://www.cloudera.com/content/cloudera/en/developer-community/new-to-data-science.html

New to Hadoop
http://www.cloudera.com/content/cloudera/en/developer-community/new-to-hadoop.html#books

Hadoop Tutorial - Cloudera Support
https://ccp.cloudera.com/display/DOC/Hadoop+Tutorial

Paid > Visibility > google.com | Searchmetrics Essentials
http://suite.searchmetrics.com/en/research/domains/paid/?url=google.com&cc=US

Product overview
http://www.searchmetrics.com/en/products/product-overview/

How to Contribute to Apache Hadoop Projects, in 24 Minutes | Apache Hadoop for the Enterprise | Cloudera
http://blog.cloudera.com/blog/2012/12/how-to-contribute-to-apache-hadoop-projects-in-24-minutes/

Creating a Runnable Binary Distribution With Maven Assembly Plugin
http://www.petrikainulainen.net/programming/tips-and-tricks/creating-a-runnable-binary-distribution-with-maven-assembly-plugin/

Maven for Beginners (5) - Custom Packaging with the assembly plugin - XELONE's Column - Blog Channel - CSDN.NET
http://blog.csdn.net/xelone/article/details/5943954


Integrating git and svn with git-svn
http://kanru.info/blog/archives/466/

toread

Open Source Big Data for the Impatient, Part 1: Hadoop tutorial: Hello World with Java, Pig, Hive, Flume, Fuse, Oozie, and Sqoop with Informix, DB2, and MySQL
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/index.html




Implementing a RESTful Web API with Python & Flask
http://publish.luisrei.com/articles/flaskrest.html





No. 1

Thursday, January 10, 2013

Daily Bookmarks 20130110


Using Queues in Web Crawling and Analysis Infrastructure « Data Big Bang Blog
http://blog.databigbang.com/using-queues-in-web-crawling-and-analysis-infrastructure/


Oracle Berkeley DB China R&D Team Blog » Deploying Oracle NoSQL Database on Multiple Nodes
http://www.bdbchina.com/2011/12/%e5%9c%a8%e5%a4%9a%e4%b8%aa%e8%8a%82%e7%82%b9%e4%b8%8a%e9%83%a8%e7%bd%b2oracle-nosql%e6%95%b0%e6%8d%ae%e5%ba%93/

LevelDB: A Fast Persistent Key-Value Store
http://google-opensource.blogspot.tw/2011/07/leveldb-fast-persistent-key-value-store.html

Oracle Berkeley DB China R&D Team Blog » Oracle NoSQL Application Example 1: Credit Card System
http://www.bdbchina.com/2012/05/oracle-nosql%e5%ba%94%e7%94%a8%e7%a4%ba%e4%be%8b1%ef%bc%9a-%e4%bf%a1%e7%94%a8%e5%8d%a1%e7%b3%bb%e7%bb%9f/


I Has A Hash Table
http://snej.github.com/2009/09/05/I-Has-A-Hash-Table/
Persisting Native Python Queues « Data Big Bang Blog
http://blog.databigbang.com/persisting-native-python-queues/


Emberjs: Getting to Know Emberjs - Liner-z - cnblogs
http://www.cnblogs.com/linerz/archive/2012/10/27/emberjs-about.html

Welcome to Tastypie! — Tastypie 0.9.12-alpha documentation
http://django-tastypie.readthedocs.org/en/latest/









Wednesday, January 09, 2013

Daily Bookmarks 20130109


The Orange Sky.: JPA: How to Use and Configure EclipseLink 2.3.1
http://servbeans.blogspot.tw/2011/11/eclipselink-231.html

Customizing the EclipseLink Logger - Programming Design Notes
http://pro.ctlok.com/2011/12/customize-eclipselink-logger.html

An Analysis of HBase Coprocessors - NoSQLFan - Following NoSQL Technology and News
http://blog.nosqlfan.com/html/3723.html

Splunk Hadoop Connect | Splunk
http://www.splunk.com/view/hadoop-connect/SP-CAAAHA3

Splunk Delivers Integration with and Monitoring of Hadoop-CSDN.NET
http://www.csdn.net/article/2012-10-25/2811155

2012 in Review: The Springtime of Cloud Computing-CSDN.NET
http://www.csdn.net/article/2012-12-27/2813223-2012_cloud_computing_review/4

Jonathan Ellis's Programming Blog - Spyced: All you ever wanted to know about writing bloom filters
http://spyced.blogspot.tw/2009/01/all-you-ever-wanted-to-know-about.html

CodingLabs | An Overview of Cardinality Estimation Algorithms
http://www.codinglabs.org/html/cardinality-estimation.html

Apache Derby Database - Tutorial
http://www.vogella.com/articles/ApacheDerby/article.html








Tuesday, January 08, 2013

Daily Bookmarks 20130108


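Hortonworks Announces a Hadoop Data Platform
http://www.infoq.com/cn/news/2011/11/Hortonworks-Hadoop
Hortonworks Announces Hadoop Data Platform
http://www.infoq.com/news/2011/11/Hortonworks-Hadoop
"Hortonworks, which grew out of Yahoo!, has many Hadoop architects and source code contributors. All of them previously worked at Yahoo! and have contributed more than 80% of the source code of the Apache Hadoop project, Hortonworks says. These engineers have also contributed to other projects in the distributed computing space (such as HCatalog, Ambari, and Pig), and at Yahoo! they gained experience running Hadoop on clusters of 40,000 servers."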


Hortonworks Releases Data Management Software Supporting Hadoop | Breaking News | iThome online
http://www.ithome.com.tw/itadm/article.php?c=74347




The Hottest Bay Area Startups For Engineers According To LinkedIn - Business Insider
http://www.businessinsider.com/the-hottest-bay-area-startups-for-engineers-according-to-linkedin-2012-5?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+typepad%2Falleyinsider%2Fsilicon_alley_insider+%28Silicon+Alley+Insider%29
Apache Ambari: Hadoop Operations, Innovation, and Enterprise Readiness | Hortonworks
http://hortonworks.com/blog/apache-ambari-hadoop-operations-innovation-and-enterprise-readiness/



Network Magazine » CloudStack vs. OpenStack Debate Heats Up
http://news.networkmagazine.com.tw/classification/software-application/2012/04/19/39157/

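Microsoft Shifts Its Big Data Strategy to Fully Support Hadoop | News Feature | iThome online
http://www.ithome.com.tw/itadm/article.php?c=77576&s=1
"Microsoft has already released preview versions of Hadoop on Windows Server and Hadoop on Windows Azure. In Microsoft's big data processing architecture, one part integrates Hadoop with products such as SQL Server, while the other treats Hadoop as a service, delivering big data services through the Windows Azure cloud platform."
Next-Generation Hadoop MapReduce - NoSQLFan - Following NoSQL Technology and News
http://blog.nosqlfan.com/html/2272.html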

What Is the Apache Incubator, and the Past and Present of Apache CXF_plstryagain_Sina Blog
http://webcache.googleusercontent.com/search?q=cache:EvAZIYF9AcwJ:blog.sina.com.cn/s/blog_57da019201013b0g.html+&cd=2&hl=zh-TW&ct=clnk&gl=tw&lr=lang_zh-CN%7Clang_zh-TW

Smarter Service Status in Puppet - Dean Wilson@UnixDaemon: In search of (a) life
http://www.unixdaemon.net/tools/puppet/smarter-service-status-in-puppet.html
Actuate Partners with Hortonworks to Visualize Big Data | civn Chinese Information Visualization Community
http://www.civn.cn/p/4487.html

Incubation Policy - Apache Incubator
http://incubator.apache.org/incubation/Incubation_Policy.html#Incubator+Project+Management+Committee+%28PMC%29
What Exactly Has Yahoo Done for Apache Hadoop - 51CTO.COM
http://os.51cto.com/art/201111/304645.htm
Talk by Hortonworks CTO Eric Baldeschwiele_NetEase Tech
http://cache.baidu.com/c?m=9f65cb4a8c8507ed4fece76310508037434380143fd3d1027fa3c215cc7958415a65e0ba253f1307cecf061c72aa325eeff234703c055cbd98df883d87fdcd763bcd7a742613d51e428059f4&p=9b73d51bcd934eaf58e8de2d0216d333&newp=8b2a9541949d50b408e2947e085580145c5bc4387ebad7167c96cd5988&user=baidu&fm=sc&query=ambari&qid=&p1=29
2012 in Review: The Springtime of Cloud Computing-CSDN.NET
http://www.csdn.net/article/2012-12-27/2813223-2012_cloud_computing_review/4
Top 5 Tools for Managing Hadoop Clusters
http://www.chinacloud.cn/show.aspx?id=9903&cid=14


Edit this Fiddle - jsFiddle
http://jsfiddle.net/asgallant/HMbYf/
"Parsing the JSON isn't hard - it works just like any other javascript object map.  See an example using your code here: http://jsfiddle.net/asgallant/HMbYf/ "








Monday, January 07, 2013

Daily Bookmarks 20130107


Design Pattern - Factory - Programming Design Notes
http://pro.ctlok.com/blog/2012/04/02/design-pattern-factory.html
A Java Factory Pattern example | Factory Design Pattern in Java | Design patterns tutorials | alvinalexander.com
http://alvinalexander.com/java/java-factory-pattern-example

team:start
http://vis.pku.edu.cn/wiki/team/start
sontek's Humble Abode - Writings from John Anderson
http://sontek.net/blog/detail/turning-vim-into-a-modern-python-ide

REST with Java (JAX-RS) using Jersey - Tutorial
http://www.vogella.com/articles/REST/article.html
[Java] Get Started with Spring Framework in 10 Minutes - Part1 (IoC Container) - The blog of typewriter職人 - dotblogs
http://www.dotblogs.com.tw/shadow/archive/2011/06/17/28784.aspx

maxerize (Maximilian Aigner) flask blog
https://github.com/maxerize
Git: The Definitive Guide - About This Book
http://www.worldhello.net/gotgit/
Dead easy yet powerful static website generator with Flask | Code | Nicolas Perriault
https://nicolas.perriault.net/code/2012/dead-easy-yet-powerful-static-website-generator-with-flask/
n1k0/nicolas.perriault.net · GitHub
https://github.com/n1k0/nicolas.perriault.net
My Git Branching Model
http://williamdurand.fr/2012/01/17/my-git-branching-model/





Saturday, January 05, 2013


IT 台灣郎: [Breaking Through the Office Network] ssh tunnel + Firefox Foxy Proxy
http://taiwanwolf.blogspot.tw/2008/11/ssh-tunnel-firefox-foxy-proxy.html


Firefox



Breakthrough

Thursday, January 03, 2013

Daily Bookmarks 20130103

Interview with Zhuang Xiaodan, Author of the Distributed Messaging Middleware MetaQ - CSDN Official Blog - Blog Channel - CSDN.NET
http://blog.csdn.net/blogdevteam/article/details/8449916
Th30z (Matteo Bertozzi Code): Improve and Tune your service/app with some statistics
http://th30z.blogspot.tw/2012/04/improve-and-tune-your-serviceapp-with.html
index More code fun
http://code.taobao.org/p/metamorphosis/wiki/index/

intro More code fun
http://code.taobao.org/p/metamorphosis/wiki/intro/
Metamorphosis
http://metaq.taobao.org/

Zhang Yunjian: eBay's Self-Service Data Service Platform, eBay Data Marketplace-CSDN.NET
http://www.csdn.net/article/2012-05-25/2805997.html
Big Data Counting: How to Count a Billion Objects Using Only 1.5KB of Memory-CSDN.NET
http://www.csdn.net/article/2012-12-21/2813063-big-data-counting-how-to-count-a-objects
Prismatic: Analyzing User Interests with Machine Learning in Just 10 Seconds-CSDN.NET good
http://www.csdn.net/article/2013-01-03/2813185-Prismatic

How to Get Familiar with an Open Source Project? - 庄周梦蝶 - BlogJava
http://www.blogjava.net/killme2008/archive/2012/05/22/378885.html
Building a Video Site - 庄周梦蝶 - BlogJava
http://www.blogjava.net/killme2008/archive/2007/12/19/168788.html


metamorphosis-1-Comparison with Other Message Queues - j2ee绿洲 - BlogJava
http://www.blogjava.net/livery/articles/391595.html
Different Approaches to Extracting Web Page Data - 庄周梦蝶 - BlogJava
http://www.blogjava.net/killme2008/archive/2007/11/22/162338.html


Relearning Java Generics - 庄周梦蝶 - BlogJava
http://www.blogjava.net/killme2008/archive/2007/06/05/122174.html

Casual Reading Notes on HBase: The Definitive Guide, Part 2: Filters - puts "hello saint!" - Blog Channel - CSDN.NET
http://blog.csdn.net/saint1126/article/details/8257941


How to Turn Every Day into 26 Hours? - Inside: Internet Trend Watch
http://www.inside.com.tw/2013/01/03/how-i-made-a-26-hour-day


