Apache Hadoop 2.3.0 Released!

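Hadoop 2.3.0 has been released. It introduces two key HDFS features, a heterogeneous storage hierarchy and in-memory data caching, along with operational improvements to YARN. The new capabilities include using different storage types within a single cluster and centrally caching data-sets in Datanode memory. HDFS ACL support, rolling upgrades, a Protobuf-based FSImage, YARN ResourceManager automatic failover and the new YARN timeline and history services are expected in the next release, 2.4.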

hadoop-2.3.0 is the first release for the year 2014, and brings a number of enhancements to the core platform, in particular to HDFS.

With this release, there are two significant enhancements to HDFS:

  • Support for Heterogeneous Storage Hierarchy in HDFS (HDFS-2832)
  • In-memory Cache for data resident in HDFS via Datanodes (HDFS-4949)

With support for heterogeneous storage classes in HDFS, we can now take advantage of different storage types on the same Hadoop cluster. Hence, we can make better cost/benefit tradeoffs across storage media such as commodity disks, enterprise-grade disks, SSDs and memory. More details on this major enhancement are tracked in HDFS-2832.

Along similar lines, it is now possible to use memory available in the Hadoop cluster to centrally cache and administer data-sets in-memory in the Datanode’s address space. Applications such as MapReduce, Hive and Pig can now request that data be cached in memory (for the curious, we use a combination of mmap and mlock to achieve this) and then read it directly from the Datanode’s address space, which makes for extremely efficient scans by avoiding disk altogether. As an example, Hive is taking advantage of this feature by implementing an extremely efficient zero-copy read path for ORC files; see HIVE-6347 for details.
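To make the mmap part of that mechanism concrete, here is a minimal sketch in Python. It is not Hadoop's implementation: it only illustrates how a file can be mapped into a process's address space and scanned without an explicit read into a separate buffer. The helper names (write_sample, scan_mapped) are made up for this illustration, and the mlock step that pins pages in memory is omitted because Python's standard mmap module does not expose it.

```python
import mmap
import os
import tempfile


def write_sample(num_rows):
    """Create a temporary file with `num_rows` short records (hypothetical data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"row\n" * num_rows)
    return path


def scan_mapped(path, needle):
    """Count non-overlapping occurrences of `needle` by scanning a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are served from the OS page cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            count = 0
            pos = mapped.find(needle)
            while pos != -1:
                count += 1
                pos = mapped.find(needle, pos + len(needle))
            return count


if __name__ == "__main__":
    sample = write_sample(1000)
    try:
        print(scan_mapped(sample, b"row\n"))
    finally:
        os.remove(sample)
```

Once the mapped pages are resident in memory, repeated scans like this one never touch the disk, which is the effect the Datanode cache aims for at cluster scale.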

In YARN, we are very excited to see that ResourceManager Automatic Failover (YARN-149) is nearly complete, even though it isn’t ready for prime time yet. We expect it to land in the next release, i.e. hadoop-2.4. Furthermore, a number of key operational enhancements have been driven into YARN, such as better logging, error handling and diagnostics.

On the MapReduce side of the house, a key enhancement is MAPREDUCE-4421: we no longer need to install MapReduce binaries on every machine, and can instead copy a MapReduce tarball into HDFS and deploy it via the YARN DistributedCache.
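As a rough illustration of what that deployment involves, the sketch below generates the two mapred-site.xml properties that, to the best of my knowledge, drive this feature: mapreduce.application.framework.path and mapreduce.application.classpath. The HDFS path, the "#mrframework" alias and the classpath entries are assumptions chosen for the example; the right classpath depends on how your tarball is laid out, so treat these values as placeholders rather than a working configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical HDFS location of the uploaded tarball. The fragment after "#"
# is the alias under which the unpacked archive appears in the container.
FRAMEWORK_URI = "hdfs:/mapred/framework/hadoop-mapreduce-2.3.0.tar.gz#mrframework"


def framework_properties(framework_uri, classpath_entries):
    """Return the properties pointing MapReduce at a tarball stored in HDFS."""
    return {
        "mapreduce.application.framework.path": framework_uri,
        # Entries must match the directory layout inside the tarball.
        "mapreduce.application.classpath": ",".join(classpath_entries),
    }


def render_mapred_site(properties):
    """Render the properties as a Hadoop-style <configuration> XML document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder classpath entries for an assumed tarball layout.
    entries = [
        "$HADOOP_CONF_DIR",
        "$PWD/mrframework/share/hadoop/mapreduce/*",
    ]
    print(render_mapred_site(framework_properties(FRAMEWORK_URI, entries)))
```

The printed fragment would be merged into mapred-site.xml after the tarball has been uploaded to the matching HDFS path.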

Of course, a number of bug fixes and other enhancements have also made it into hadoop-2.3, continuing to improve the core platform. Please see the hadoop-2.3.0 release notes for more details.

Looking Ahead to Apache Hadoop 2.4.0

With hadoop-2.3, the community has again delivered a major upgrade to the platform. Looking ahead, a number of exciting features are shaping up for Apache Hadoop 2.4, such as:

  • Support for ACLs in HDFS (HDFS-4685)
  • Key operability features such as support for Rolling Upgrades in HDFS (HDFS-5535) and FSImage being enhanced to use ProtoBufs (HDFS-5698)
  • YARN ResourceManager Automatic Failover (YARN-149)
  • YARN Generic Application Timeline (YARN-1530) & History (YARN-321) services to make it significantly easier to develop and manage new frameworks and services in YARN.

Acknowledgements

Many thanks to everyone who contributed to the release, and everyone in the Apache Hadoop community. For the reader’s edification, hadoop-2.3.0 has 560 JIRAs fixed. Of these, 138 are in Hadoop Common, 203 made it to HDFS, 148 are in YARN and 71 went into MapReduce. So, thank you to every single one of the contributors, reviewers and testers!

In particular, I’d like to call out the following folks: Arpit Agarwal and Tsz Wo Sze for their work on Heterogeneous Storage; Andrew Wang, Colin McCabe and Chris Nauroth for their efforts on the In-Memory Datanode Cache; Jason Lowe for his work on forklifting MapReduce to deploy via the DistributedCache; and several folks from Twitter, such as Gera Shegalov, Lohit V., Joep R. and others, for a number of unsung but very key operational enhancements and fixes to YARN.

Ref: http://hortonworks.com/blog/apache-hadoop-2-3-0-released/
