
Flink to Integrate with Pulsar for Elastic Data Processing at Scale

2019-05-17 14:29

Python進階學習交流 (Python Advanced Learning Exchange)

Future Integration

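Pulsar can integrate with Apache Flink in several ways. Potential integrations include streaming connectors to support streaming workloads and batch source connectors to support batch workloads. Pulsar also has native support for schemas, which could be integrated with Flink to give structured access to the data, for example by using Flink SQL to query data stored in Pulsar. Finally, another way to combine the two technologies would be to use Pulsar as a state backend for Flink. Because Pulsar has a layered architecture (Streams and Segmented Streams, backed by Apache BookKeeper), using Pulsar as the storage layer that holds Flink state is a natural fit.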

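From an architectural point of view, we can picture an integration in which Apache Pulsar provides a unified view of the data layer and Apache Flink serves as the unified computation and data processing framework and API.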

Existing Integration

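Integration between the two frameworks is ongoing, and developers can already use Pulsar with Flink in several ways. For example, Pulsar can serve as both a streaming source and a streaming sink in Flink DataStream applications. Developers can ingest data from Pulsar into a Flink job that computes over and processes the data in real time, and then send the results back to a Pulsar topic through a streaming sink. Such an example is shown below: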

// create and configure Pulsar consumer
PulsarSourceBuilder<String> builder = PulsarSourceBuilder
    .builder(new SimpleStringSchema())
    .serviceUrl(serviceUrl)
    .topic(inputTopic)
    .subscriptionName(subscription);
SourceFunction<String> src = builder.build();

// ingest DataStream with Pulsar consumer
DataStream<String> words = env.addSource(src);

// perform computation on DataStream (here a simple WordCount)
DataStream<WordWithCount> wc = words
    .flatMap((FlatMapFunction<String, WordWithCount>) (word, collector) -> {
        collector.collect(new WordWithCount(word, 1));
    })
    .returns(WordWithCount.class)
    .keyBy("word")
    .timeWindow(Time.seconds(5))
    .reduce((ReduceFunction<WordWithCount>) (c1, c2) ->
        new WordWithCount(c1.word, c1.count + c2.count));

// emit result via Pulsar producer
wc.addSink(new FlinkPulsarProducer<>(
    serviceUrl,
    outputTopic,
    new AuthenticationDisabled(),
    wordWithCount -> wordWithCount.toString().getBytes(UTF_8),
    wordWithCount -> wordWithCount.word)
);
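
The snippet above relies on a WordWithCount type that the article does not show. The following is a minimal sketch of what such a type might look like, assuming a simple class with public fields. Only the field names word and count come from the snippet; the class name WordCountSketch, the toString format, and the helper reduceByWord are hypothetical. The helper uses plain Java, not Flink, to mimic what keyBy("word") followed by reduce computes over the records of a single window.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {

    // Hypothetical type exposing the two fields the snippet accesses (word, count).
    public static class WordWithCount {
        public String word;
        public long count;

        public WordWithCount() {}

        public WordWithCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return word + " : " + count;
        }
    }

    // Plain-Java stand-in for keyBy("word") + reduce over the records of ONE window.
    // This is an illustration of the semantics, not how Flink executes the job.
    public static Map<String, WordWithCount> reduceByWord(List<WordWithCount> window) {
        Map<String, WordWithCount> result = new LinkedHashMap<>();
        for (WordWithCount in : window) {
            // Same combining function as the reduce lambda in the snippet.
            result.merge(in.word, in,
                (c1, c2) -> new WordWithCount(c1.word, c1.count + c2.count));
        }
        return result;
    }

    public static void main(String[] args) {
        List<WordWithCount> window = Arrays.asList(
            new WordWithCount("pulsar", 1),
            new WordWithCount("flink", 1),
            new WordWithCount("pulsar", 1));
        reduceByWord(window).values().forEach(System.out::println);
    }
}
```

In the real job, Flink applies the reduce function incrementally as records arrive and emits one result per key when each 5-second window closes; the sketch only shows the per-window result.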

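Another integration developers can take advantage of is using Pulsar as both a streaming source and a streaming table sink for Flink SQL or Table API queries, as shown in the following example: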

// obtain a DataStream with words
DataStream<String> words = ...

// register DataStream as Table "words" with two attributes ("word", "ts").
// "ts" is an event-time timestamp.
tableEnvironment.registerDataStream("words", words, "word, ts.rowtime");

// create a TableSink that produces to Pulsar
TableSink sink = new PulsarJsonTableSink(
    serviceUrl,
    outputTopic,
    new AuthenticationDisabled(),
    ROUTING_KEY);

// register Pulsar TableSink as table "wc"
tableEnvironment.registerTableSink(
    "wc",
    sink.configure(
        new String[]{"word", "cnt"},
        new TypeInformation[]{Types.STRING, Types.LONG}));

// count words per 5 seconds and write result to table "wc"
tableEnvironment.sqlUpdate(
    "INSERT INTO wc " +
    "SELECT word, COUNT(*) AS cnt " +
    "FROM words " +
    "GROUP BY word, TUMBLE(ts, INTERVAL '5' SECOND)");
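
To make the grouping in the query above concrete, here is a rough plain-Java sketch of how a 5-second tumbling window buckets rows by their event-time timestamp. It assumes epoch-aligned windows with no offset and deliberately ignores watermarks and late data; the class name TumbleSketch and its methods are hypothetical and are not part of Flink or Pulsar.

```java
import java.util.Map;
import java.util.TreeMap;

public class TumbleSketch {

    // Window size matching INTERVAL '5' SECOND in the query.
    static final long WINDOW_MILLIS = 5_000L;

    // Start of the epoch-aligned tumbling window [start, start + 5s) that contains tsMillis.
    public static long windowStart(long tsMillis) {
        return tsMillis - Math.floorMod(tsMillis, WINDOW_MILLIS);
    }

    // Counts rows per (window start, word), analogous to
    // GROUP BY word, TUMBLE(ts, INTERVAL '5' SECOND) with COUNT(*).
    public static Map<Long, Map<String, Long>> countPerWindow(String[] words, long[] tsMillis) {
        Map<Long, Map<String, Long>> out = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            out.computeIfAbsent(windowStart(tsMillis[i]), k -> new TreeMap<>())
               .merge(words[i], 1L, Long::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"pulsar", "flink", "pulsar", "pulsar"};
        long[] ts = {1_000L, 4_999L, 5_000L, 9_000L};
        // prints {0={flink=1, pulsar=1}, 5000={pulsar=2}}
        System.out.println(countPerWindow(words, ts));
    }
}
```

A row stamped exactly at 5000 ms falls into the second window because each window includes its start and excludes its end. In an actual Flink job the result for a window is emitted only once the watermark passes the window end, which this sketch does not model.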

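Finally, Flink integrates with Pulsar for batch workloads by using Pulsar as a batch sink: all results are pushed to Pulsar once Apache Flink has finished computing over a static data set. Such an example is shown below: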

// obtain DataSet from arbitrary computation
DataSet<WordWithCount> wc = ...

// create PulsarOutputFormat instance
OutputFormat pulsarOutputFormat = new PulsarOutputFormat(
    serviceUrl,
    topic,
    new AuthenticationDisabled(),
    wordWithCount -> wordWithCount.toString().getBytes());

// write DataSet to Pulsar
wc.output(pulsarOutputFormat);

Conclusion

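Pulsar and Flink share a similar view of how the data layer and the computation layer of an application can be "streamed", with batch treated as a special case of streaming. With Pulsar's Segmented Streams approach and Flink's steps to unify batch and stream processing workloads under one framework, there are many ways to integrate the two technologies to provide elastic data processing at scale.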
