Use Rational Data Architect to integrate data sources

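This article introduces a five-step process for designing data source federation with IBM Rational Data Architect: annotating existing infrastructure, mapping data sources to each other, creating a federated model, mapping federation data sources, and generating federation code.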
No doubt about it -- information integration is challenging. Many business decisions must be documented and many transformations must be performed. IBM Rational® Data Architect can document your decisions and automate part of this process. Read this article to explore a tool-supported process for federation design in just five steps.

Introduction

When attempting to integrate data sources, you need to consider many activities. Rational Data Architect can help document decisions and automate parts of your tasks. In this article, you are introduced to a process you can use and modify for your specific data integration needs. The five steps to a successful design, covered in this article, are:

  1. Annotating existing infrastructure
  2. Mapping data sources to each other
  3. Creating a federated model
  4. Mapping federation data sources
  5. Generating federation code

Rational Data Architect product overview

Rational Data Architect is a data modeling and integration design tool that helps data architects understand information assets and their relationships and dependencies, map assets to each other, and create integration schemas. Architected for teams of any size, Rational Data Architect combines data modeling with mapping discovery and model and database analysis -- all in a single, integrated tool. In addition, Rational Data Architect supports enterprise standards enforcement. Rational Data Architect uses a heterogeneous approach that facilitates federation design and is an essential tool for information integration projects.

Rational Data Architect provides tools that can dramatically reduce design and development hours. This new software, built on the open source Eclipse platform, helps data architects model, discover, map, and analyze data across multiple information sources, automating information integration in complex environments.

Annotating existing infrastructure

The first step of the process helps users assess their current situation. Although this phase involves some automated steps, such as reverse engineering, most of this process is done manually because every annotation is basically a high-probability guess. It is essential to have the participation of the original designers of the data source, and users of the data source.

To annotate your existing infrastructure:

  1. Connect to the existing data source.

    To be able to access the data structure, you need to follow standard connectivity protocols. You need to know the type of the data source, the driver used to connect to it, and the login information (in most cases, login and password). Rational Data Architect uses standard JDBC connectivity to connect to the data source. All further communication with the data source is performed using native queries to the system tables of the data source.

  2. Select the subset of available data structures from the data source.

    Many data sources include data that is irrelevant for understanding stored information, such as counters, temporary helper tables used to sort data, and multilingual text for the user interface. It is much easier to eliminate such data structures at the beginning of the process.

    Rational Data Architect allows filtering at any level of the data structure in the Database Explorer, shown in Figure 1. We will define a filter that will leave only relevant information.

    Figure 1. Connecting to the data source

  3. Create a model from the selected subset.

    There are two main reasons to create a model from the data source:

    • Most databases are not able to capture business relevant annotations and documentation at the level of detail needed for a successful integration process.

    • Change management. Integrations have to be designed on a stable data structure. If the structure of the data source changes, you need to update the integration, resulting in a new version of the model.

    A physical data model created from the data source is basically an abstracted copy of the data structure from the data source. See Figure 2.

    Figure 2. Creating physical data model

  4. Document data structures in the model.

    While a model captures most of the detail specified in the data source, this is not enough to understand the data. For example, CLNR specified as CHAR(16) is not something that every developer would interpret in exactly the same way. In this activity, you add documentation to every element in the model, including every column, every table, every constraint, and every trigger. You should also specify business-relevant names to make the model faster to read.

    It's also strongly recommended that you create context-relevant diagrams. However, this does not mean you should create a huge diagram gathered from the walls of many meeting rooms. Instead, create small diagrams with approximately seven essential elements. (You can have fewer, but avoid more, if possible.)

  5. Create or verify a glossary related to the model.

    Together with activity 4, you can start creating a glossary that defines the meaning of names in the data source. Designers and developers have always sought to use names that make their jobs easier. Even under severe constraints on the length of names, naming standards were used to keep names simple. How consistently they were applied depended on the discipline and life cycle of each data source.

    You can refer to a glossary in Rational Data Architect, which includes a list of valid business names with possible abbreviations, shown in Figure 3. For example, the abbreviation CL could stand for client and the abbreviation NR could mean number. Some data sources could have even more extreme, non-intuitive abbreviations, such as J9 to mean client or O1 to indicate identifier. Rational Data Architect does not limit the number of glossaries that can be used at the same time, although I personally recommend that you use only one glossary per model. (This is, by the way, not a technical recommendation, but a user-experience based recommendation.)

    Figure 3. Defining the glossary

These five activities to annotate your current situation may seem short, but most are very time-intensive and involve a lot of manual work.
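The glossary lookup described in activity 5 can be illustrated with a small sketch. This is not how Rational Data Architect implements glossaries; the `GLOSSARY` dictionary and the `expand_name` helper are hypothetical names, and the greedy longest-match splitting is an assumption made for illustration. Only the abbreviations CL, NR, J9, and O1 come from the text above.

```python
# Hypothetical glossary: abbreviation -> business name.
# The four entries are the examples mentioned in the article.
GLOSSARY = {
    "CL": "client",
    "NR": "number",
    "J9": "client",
    "O1": "identifier",
}


def expand_name(column_name, glossary=GLOSSARY):
    """Expand an abbreviated column name into business words.

    The name is split greedily, always trying the longest known
    abbreviation first. Parts that match nothing are kept verbatim,
    so gaps in the glossary stay visible to the reviewer.
    """
    name = column_name.upper().replace("_", "")
    longest = max((len(abbr) for abbr in glossary), default=0)
    words = []
    unknown = ""
    pos = 0
    while pos < len(name):
        for size in range(min(longest, len(name) - pos), 0, -1):
            chunk = name[pos:pos + size]
            if chunk in glossary:
                if unknown:
                    words.append(unknown)
                    unknown = ""
                words.append(glossary[chunk])
                pos += size
                break
        else:
            # No abbreviation starts here; collect the character as unknown text.
            unknown += name[pos]
            pos += 1
    if unknown:
        words.append(unknown)
    return " ".join(words)
```

Calling `expand_name("CLNR")` with this glossary yields a readable business name for the column from the earlier example. Unmatched fragments are passed through unchanged, which makes them easy to spot when you review the glossary with the original designers.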

Mapping data sources to each other

The integration process typically includes integrating from more than one data source and each data source needs to be annotated before you can proceed. After annotating the existing infrastructure, you understand each data source separately, but are still unclear about the overlapping and related information from all data sources.

Mapping existing data sources is optional, because it does not produce results that are required to further automate the process. However, it's highly recommended that you do the mapping, to increase your understanding of the completeness of data for integration, and to foresee possible collisions of data between different data sources.

To map data sources to each other:

  1. Create a new mapping model between each pair of data source models.

    A mapping is a dependency between two data structures that is not implemented in the realization of the data source. A mapping model is a summary of mappings between two independent data sources or data models. The number of mapping models rapidly increases with the number of data sources. You could have one mapping model for two sources, three mapping models for three sources, six mapping models for four sources -- all counting just one direction of models. If you are working with many data sources, you don't typically have to create all of the models. Instead, you can use some of them as references and create mapping models only to those models, as shown in Figure 4.

    Figure 4. Map data source models

  2. Discover (automatically or manually) mappings between the data source structures.

    Remember the glossary created in the previous section? At this point, the glossary can help you automate an activity. Mapping discovery can use glossaries to create better suggestions for possible mappings. Each mapping expresses the rule for creating the target structure from the source structure. For example, suppose you have a mapping between a driver's license as the target and a birth certificate as the source. A mapping to the "name" on the driver's license would be a concatenation of the "first name," "middle name," and "last name" from the birth certificate. This is an example of a mapping that includes a transformation. Models typically have hundreds of such elements. It is possible to define all of the mappings manually, but it would take weeks of work.

    Rational Data Architect can help you identify the simplest of all mappings, which realistically represent the vast majority: the one-to-one mappings. Those are mappings from "family name" to "surname," for example. In the first version of Rational Data Architect, mapping discovery can use a combination of up to five discovery algorithms.

    The simplest mapping compares the names of model elements, and optionally uses glossary models to increase the precision of results by expanding abbreviations into business names before comparison. More complex mapping discovery uses externally purchased thesauruses to find synonyms or even data samples from the data source to validate possible mappings. The discovery of mappings has to be done for each mapping model and should be accompanied by documentation of individual mappings for easier readability of the model.

  3. Complete annotations of data source models.

    You can gain additional understanding of data source models from mapping models. For example, you might discover that some data structure in the first data source is related to a data structure in another data source. It could also be an invalidation notice specifying that part of data should not be considered in the integration process because it is inaccurate. It is extremely valuable to complete the mapping between existing data sources, even if you do not intend to integrate information.
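The simplest kind of discovery described above, comparing element names after expanding abbreviations through a glossary, can be sketched as follows. This is only an illustration: the article does not describe the product's discovery algorithms in detail, and `GLOSSARY`, `normalize`, `discover_mappings`, and `mapping_model_pairs` are hypothetical names. The sketch assumes column names are separated by underscores.

```python
from itertools import combinations

# Hypothetical glossary used to expand abbreviations before comparison.
GLOSSARY = {"CL": "client", "NR": "number", "NM": "name"}


def normalize(name, glossary):
    """Lower-case a column name and expand underscore-separated abbreviations."""
    parts = name.upper().split("_")
    return " ".join(glossary.get(part, part.lower()) for part in parts)


def discover_mappings(source_columns, target_columns, glossary=GLOSSARY):
    """Suggest one-to-one mappings whose normalized names are equal."""
    targets = {}
    for column in target_columns:
        targets.setdefault(normalize(column, glossary), []).append(column)
    suggestions = []
    for column in source_columns:
        for match in targets.get(normalize(column, glossary), []):
            suggestions.append((column, match))
    return suggestions


def mapping_model_pairs(model_names):
    """List every unordered pair of models, one mapping model per pair."""
    return list(combinations(model_names, 2))
```

`mapping_model_pairs` reproduces the growth mentioned in activity 1: two models give one pair, three give three, and four give six. The suggestions returned by `discover_mappings` are candidates only; each one still needs to be reviewed and documented.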

The results of the mappings should be explored from two perspectives:

  • Competing data from different models. Competing data could result in a more complex integration specification that either prioritizes data from one data source over the other or includes the most recent data.
  • Exclusivity of data structures. These structures should be examined to determine whether it's necessary to include them in the federated model.

Both examinations result in business decisions and are dependent on your reasons for information integration.
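To make the two perspectives concrete, here is a minimal sketch that splits the elements of two models into competing and exclusive sets, and then applies one of the two resolution rules mentioned above. All names (`classify_elements`, `pick_value`, the `source`, `value`, and `updated` keys) are hypothetical, and the choice of rule remains a business decision that code cannot make for you.

```python
from datetime import date


def classify_elements(source_a, source_b, mappings):
    """Split elements of two models into competing and exclusive sets.

    mappings is an iterable of (element_in_a, element_in_b) pairs.
    """
    mappings = list(mappings)
    mapped_a = {a for a, _ in mappings}
    mapped_b = {b for _, b in mappings}
    return {
        "competing": sorted(mappings),
        "exclusive_a": sorted(set(source_a) - mapped_a),
        "exclusive_b": sorted(set(source_b) - mapped_b),
    }


def pick_value(candidates, strategy, priority=()):
    """Choose one value among competing candidates.

    Each candidate is a dict with the keys 'source', 'value', and 'updated'.
    'priority' prefers the source listed first; 'most_recent' prefers
    the candidate with the latest 'updated' date.
    """
    if strategy == "priority":
        rank = {name: index for index, name in enumerate(priority)}
        best = min(candidates, key=lambda c: rank.get(c["source"], len(rank)))
    elif strategy == "most_recent":
        best = max(candidates, key=lambda c: c["updated"])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return best["value"]
```

The exclusive sets are the elements to review for inclusion in the federated model; the competing pairs are the ones that need a documented rule.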

Creating a federated model

Gaining a good understanding of data sources is essential to validate whether you can complete the process of information integration. A main component of this process is specifying the target, or the schema, that will be visible after the integration. This step should unify the business demand that requires integration with the possibilities of your existing information.

  1. Create a business (logical) model aimed at the solution.

    A business model defines entities and relationships between entities, without consideration of the implementation platform. The model has to solve the business problem. If the business problem requires just a summary of all account standings, for example, then you don't need to include order details in the model.

    Rational Data Architect implements this view as a logical data model, as shown in Figure 5.

    Figure 5. Logical data model

    A logical data model is not constrained regarding possible relationships between different entities. It can contain any kind of relationship, including subtyping and many-to-many relationships. During the design process of the logical model, the ongoing validation with business stewards, the owners of the business process, is extremely important. Only they can recognize if something is missing or if the model is not correct regarding relationships and rules.

    To make the model even more understandable, you should create as many diagrams as required to express different business views. Documentation and annotation are the most important parts of models. Imagine how it would feel if someone gave you a model to read without a single line of additional documentation -- the model would lose some of its meaning and you could end up considering it nothing more than a nice drawing.

  2. Turn the logical model into a physical implementation model.

    The logical model expresses the business view of information. The next activity is to turn this model into a physical model that is constrained by the technology we'll use to realize it. This process is relatively straightforward for the first transformation and requires care during version upgrades of models.

    Rational Data Architect allows you to transform a logical model to a physical model. During the transformation, Rational Data Architect automatically resolves all constraints of the target model, such as lack of many-to-many relationships or subtyping, and implements them correctly for the selected target. Rational Data Architect also lets you compare a logical model to the physical model, and update a physical model from this comparison, using the Compare & Synchronize function.

The resulting physical model is not the model that will actually be implemented as a schema in WebSphere Information Integrator; it is a prototype of the integration model, which will be created during the code generation and will replace tables with corresponding nicknames and views.
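As an illustration of what resolving a many-to-many relationship means during the logical-to-physical transformation, the following sketch generates an association table for two entities. It is a hand-written example in generic SQL, not output from Rational Data Architect; the function name, the table naming scheme, and the default key type are assumptions.

```python
def resolve_many_to_many(left, right, left_key, right_key, key_type="INTEGER"):
    """Return generic SQL DDL for an association table linking two entities.

    A logical many-to-many relationship between `left` and `right`
    becomes a table holding one foreign key to each side, with both
    columns together forming the primary key.
    """
    table = f"{left}_{right}"
    left_col = f"{left}_{left_key}"
    right_col = f"{right}_{right_key}"
    return (
        f"CREATE TABLE {table} (\n"
        f"  {left_col} {key_type} NOT NULL REFERENCES {left} ({left_key}),\n"
        f"  {right_col} {key_type} NOT NULL REFERENCES {right} ({right_key}),\n"
        f"  PRIMARY KEY ({left_col}, {right_col})\n"
        f")"
    )
```

For example, `resolve_many_to_many("customer", "account", "id", "id")` produces a `customer_account` table. Subtyping needs a different resolution (for example, one table per subtype or a single table with a discriminator column), which this sketch does not cover.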

Mapping federation data sources

The fourth major step in this information integration design is to create the mapping between original data sources represented by physical models and the target federation model, also represented by a physical model. This mapping has to be complete and executable to be able to generate code.

The activities in this mapping are very similar to Mapping data sources to each other, with only a few alterations.

  1. Create a new mapping model between each data source model and the federated model.

    This step results in exactly the same number of models as the number of data sources. The summary of all of those models will define how to create the complete federated schema from existing data sources. There will very likely be competing specifications for an element in different data models. We don't address them in this activity, but will eliminate them later on.

  2. Discover (automatically or manually) mappings between the data source structures.

    As in the previous case, you need to discover mappings between source and federated schemas, as shown in Figure 6. This activity is almost identical to the activity between different source schemas discussed earlier. You need to take care of more complicated cases that span more than one table structure on the source by using mapping groups. A mapping group is comparable to a result set that you get with one selection of data from the source to receive federated data (or one "select" statement).

    You can use the alternative view of mapping groups in Rational Data Architect to evaluate and define joins of any complexity. If joins already exist in the source model, they will be automatically suggested in the mapping editor.

    Figure 6. Mapping discovery

  3. Complete transformations for mappings of data source models.

    To use mappings to create federation code, you need to define executable transformations. Whenever there is a need for a change of format, content, or structure of data, you need to specify how this will be performed. This requires transformation code that is known to the server -- in this case, WebSphere Information Integrator.

    Use the expression builder or enter the transformation directly in the expression property of a mapping in Rational Data Architect. The expression builder already offers a selection of predefined WebSphere Information Integrator functions that can be used.

Next you need to define all necessary transformations from the source to the federation schema. There is just one problem: there might be too many transformations. Because independent mapping editors were used, you don't have any control over the number of mappings that are defined for each element (column) on the target. This is something that you should resolve if you want to generate code.
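The idea that a mapping group corresponds to one "select" statement, with an executable transformation per target column, can be sketched like this. The dictionary layout, the table and column names, and the `render_select` and `preview` helpers are hypothetical. SQLite is used here only as a stand-in to check that the expressions execute; real expressions would have to use functions known to WebSphere Information Integrator.

```python
import sqlite3

# A hypothetical mapping group: one target table, its source tables,
# an optional join condition, and one expression per target column.
MAPPING_GROUP = {
    "target": "DRIVERS_LICENSE",
    "sources": ["BIRTH_CERTIFICATE"],
    "join": None,
    "mappings": [
        ("NAME", "FIRST_NAME || ' ' || MIDDLE_NAME || ' ' || LAST_NAME"),
        ("BIRTH_DATE", "DATE_OF_BIRTH"),
    ],
}


def render_select(group):
    """Render a mapping group as a single SELECT statement."""
    columns = ",\n  ".join(
        f"{expression} AS {target}" for target, expression in group["mappings"]
    )
    sql = f"SELECT\n  {columns}\nFROM {', '.join(group['sources'])}"
    if group["join"]:
        sql += f"\nWHERE {group['join']}"
    return sql


def preview(sql, setup_statements):
    """Run the statement against an in-memory SQLite database as a smoke test."""
    connection = sqlite3.connect(":memory:")
    try:
        for statement in setup_statements:
            connection.execute(statement)
        return connection.execute(sql).fetchall()
    finally:
        connection.close()
```

The first mapping is the concatenation from the driver's license example earlier in this article. Running `preview` with a small test table is a cheap way to catch expressions that are not executable before you attempt code generation.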

Generating federation code

The final step is the transformation from models back to executable code. You'll do this from the mapping model. But how can you make sure that you generate the right code?

To receive valid code for information integration from all data sources:

  1. Combine all mapping models into one.

    First, you need to get an overview of everything we defined as mapping from any of our data sources to the federated model. You can do this if you overlay all of the source models on one side, and leave the single federated model as the target on the other side. This step results in a very busy model with a lot of mappings, which should not be a big concern, because you'll eliminate many of them in the next step.

    Rational Data Architect lets you combine two mapping models into one in several ways. The one we'll use combines two models with identical targets. We will repeat this until all of the models are joined into one.

    Another possible way you could combine two mapping models is when the target of one is identical to the source of the other model.

  2. Eliminate competing mappings.

    This activity is essential if you want to receive a single executable model. The result needs to be a single executable mapping for each of the target elements (columns). Combining all mapping models created many elements that are targets for more than one mapping. We will look at such elements and select one single mapping. All other mappings need to be removed.

    Alternatively, you could also delete a mapping group if you decide that the whole mapping group (the join) should not be used.

    You also need to delete all mapping groups that are empty. You can easily do this by selecting the mapping group details view in the resulting mapping model.

  3. Generate target schema from mapping model.

    From the model, you can generate the DDL, though we have to be careful. Remember that every physical model knows about the target capabilities. You need to select a model generated from WebSphere Information Integrator to receive federation code with nicknames and generated views.

    While in the code generation wizard from the mapping model, Rational Data Architect allows for changes to the names for any generated element, as shown in Figure 7. The result of the code generation is a schema with all elements in the target integration model, as well as a script for code generation.

    Figure 7. Generate integrated schema

  4. Execute schema DDL with WebSphere Information Integrator.

    It's rewarding to see the generated script and to know it's available for changes. I recommend generating code from the model itself because you can compare it with the target and generate code selectively.

    When generating code, you'll use a connection to WebSphere Information Integrator -- the same as used to reverse engineer initial models.
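The first two activities in this step, combining mapping models that share a target and then leaving exactly one mapping per target column, can be sketched as follows. In the tool this selection is a manual decision; the source priority list used here is an assumption that stands in for that decision, and all function names and dictionary keys are hypothetical.

```python
from collections import defaultdict


def combine(mapping_models):
    """Overlay mapping models that share the same target into one list."""
    combined = []
    for model in mapping_models:
        combined.extend(model)
    return combined


def competing_targets(mappings):
    """Return the target columns that receive more than one mapping."""
    by_target = defaultdict(list)
    for mapping in mappings:
        by_target[mapping["target"]].append(mapping)
    return {target: group for target, group in by_target.items() if len(group) > 1}


def eliminate(mappings, source_priority):
    """Keep one mapping per target column.

    When several sources map to the same target, the source listed
    earliest in source_priority wins; unlisted sources rank last.
    """
    rank = {source: index for index, source in enumerate(source_priority)}
    chosen = {}
    for mapping in mappings:
        current = chosen.get(mapping["target"])
        new_rank = rank.get(mapping["source"], len(rank))
        if current is None or new_rank < rank.get(current["source"], len(rank)):
            chosen[mapping["target"]] = mapping
    return list(chosen.values())
```

After `eliminate`, calling `competing_targets` on the result should return an empty dictionary, which is the precondition the article names for generating a single executable model.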

And now you've finished the design process. At this point, it's time to think about test and deployment.

Summary

This article described a five-step process for federation design that will produce a federated schema. You also end up with a set of intermediate models that are completely reusable, and will shorten the process next time. This process also helps increase your understanding of the overall information infrastructure.

Rational Data Architect was created to help you with your information integration. I invite you to explore more about it using the download in Resources.

Resources
