The data cleanup mechanism of Spark Streaming
(I) DStream and RDD
As we know, Spark Streaming's computation is built on Spark Core, and the core abstraction of Spark Core is the RDD, so Spark Streaming is necessarily tied to RDDs as well. However, Spark Streaming does not have users work with RDDs directly. Instead it introduces its own abstraction, the DStream. A DStream contains RDDs: you can think of the relationship as the decorator pattern familiar from Java, in which a DStream enhances the RDD while behaving much like one.
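The decorator relationship described above can be illustrated with a minimal sketch. It uses plain Python stand-ins rather than Spark itself: `ToyRDD`, `ToyDStream`, and `foreach_rdd` are hypothetical names invented for this illustration, and the real classes are far more involved.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Hypothetical stand-in for an RDD: one immutable batch of records."""

    def __init__(self, records: List[int]) -> None:
        self.records = list(records)

    def map(self, f: Callable[[int], int]) -> "ToyRDD":
        # A transformation returns a new ToyRDD and leaves this one untouched.
        return ToyRDD([f(x) for x in self.records])

    def collect(self) -> List[int]:
        return list(self.records)


class ToyDStream:
    """Hypothetical stand-in for a DStream: one ToyRDD per batch time (ms)."""

    def __init__(self, batches: Dict[int, ToyRDD]) -> None:
        self.batches = dict(batches)

    def map(self, f: Callable[[int], int]) -> "ToyDStream":
        # Same operation name as on the RDD; the work is delegated
        # to the RDD wrapped for each batch time.
        return ToyDStream({t: rdd.map(f) for t, rdd in self.batches.items()})

    def foreach_rdd(self, g: Callable[[int, ToyRDD], None]) -> None:
        # Output step: hand each batch's RDD to user code in batch-time order.
        for t in sorted(self.batches):
            g(t, self.batches[t])


stream = ToyDStream({1000: ToyRDD([1, 2]), 2000: ToyRDD([3])})
doubled = stream.map(lambda x: x * 2)

seen: Dict[int, List[int]] = {}
doubled.foreach_rdd(lambda t, rdd: seen.update({t: rdd.collect()}))
print(seen)
```

The point of the sketch is only that the wrapper exposes the same operation names as the wrapped object and forwards the work to it, once per batch.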
DStream and RDD have several things in common.
(1) Both offer similar transformations, such as map and reduceByKey. DStream also adds some of its own, such as window and mapWithState.
(2) Both have operations that trigger computation: actions on an RDD, such as count, and output operations on a DStream, such as foreachRDD.
The programming model is consistent between the two.
(II) Introduction to DStream in Spark Streaming
DStream comes in several kinds of classes.
(1) Input classes, such as InputDStream and, more specifically, DirectKafkaInputDStream.
(2) Transformation classes, such as MappedDStream and ShuffledDStream.
(3) Output classes, such as ForEachDStream.
As the list above shows, data is handled by the DStream system from beginning (input) to end (output). Users therefore cannot create and manipulate the RDDs directly, which means DStream has both the opportunity and the obligation to take responsibility for the lifecycle of those RDDs.
In other words, Spark Streaming cleans up its data automatically.
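To make the lifecycle idea concrete, here is a minimal sketch of a stream that owns the RDDs it generates and drops the ones that fall outside a retention window. It is a toy model in plain Python, not Spark's implementation: `ToyDStream`, `remember_ms`, `generate`, and `clear_old` are hypothetical names, and the retention rule (drop every batch at or before `now - remember_ms`) is an assumption chosen for illustration.

```python
from typing import Dict, List


class ToyRDD:
    """Hypothetical stand-in for a cached RDD."""

    def __init__(self, records: List[str]) -> None:
        self.records = list(records)
        self.persisted = True

    def unpersist(self) -> None:
        # Models releasing the cached data held for this batch.
        self.persisted = False


class ToyDStream:
    """Hypothetical stream that owns every RDD it generates."""

    def __init__(self, remember_ms: int) -> None:
        self.remember_ms = remember_ms          # assumed retention window
        self.generated: Dict[int, ToyRDD] = {}  # batch time (ms) -> RDD

    def generate(self, time_ms: int, records: List[str]) -> ToyRDD:
        rdd = ToyRDD(records)
        self.generated[time_ms] = rdd
        return rdd

    def clear_old(self, now_ms: int) -> List[int]:
        # Drop and unpersist every batch at or before the cutoff.
        cutoff = now_ms - self.remember_ms
        old = sorted(t for t in self.generated if t <= cutoff)
        for t in old:
            self.generated.pop(t).unpersist()
        return old


stream = ToyDStream(remember_ms=2000)
first = stream.generate(1000, ["a"])
for t in (2000, 3000, 4000):
    stream.generate(t, ["x"])

removed = stream.clear_old(now_ms=4000)
print(removed, sorted(stream.generated))
```

Because user code never holds the only reference that matters, the stream can decide on its own when a batch is no longer needed, which is what makes automatic cleanup possible.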
(III) How RDDs are generated in Spark Streaming
The flow of RDDs through Spark Streaming is roughly as follows.
(1) In an InputDStream, the received data is turned into an RDD. For example, DirectKafkaInputDStream generates a KafkaRDD.
(2) The data then passes through MappedDStream and the other transformation classes, each of which directly calls the matching transformation (here, map) on the RDD.
(3) In the output class, the RDD is finally exposed, and only at this point can the user run their own storage, further computation, and other operations on it.
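The three steps above can be sketched as a small chain of objects, each asking its parent for the RDD of a given batch time. This is a simplified illustration in plain Python under stated assumptions, not Spark code: the `Toy*` classes, `compute`, and `run_batch` are hypothetical names, and the in-memory `source` dictionary stands in for data received from an external system.

```python
from typing import Callable, Dict, List, Tuple


class ToyRDD:
    """Hypothetical stand-in for an RDD."""

    def __init__(self, records: List[str]) -> None:
        self.records = list(records)

    def map(self, f: Callable[[str], str]) -> "ToyRDD":
        return ToyRDD([f(x) for x in self.records])


class ToyInputDStream:
    """Step 1: turn the data received for a batch into an RDD."""

    def __init__(self, source: Dict[int, List[str]]) -> None:
        self.source = source  # assumed in-memory stand-in for received data

    def compute(self, time_ms: int) -> ToyRDD:
        return ToyRDD(self.source.get(time_ms, []))


class ToyMappedDStream:
    """Step 2: call the matching RDD transformation on the parent's RDD."""

    def __init__(self, parent, f: Callable[[str], str]) -> None:
        self.parent = parent
        self.f = f

    def compute(self, time_ms: int) -> ToyRDD:
        return self.parent.compute(time_ms).map(self.f)


class ToyForEachDStream:
    """Step 3: expose the batch's RDD to the user's function."""

    def __init__(self, parent, user_func: Callable[[ToyRDD, int], None]) -> None:
        self.parent = parent
        self.user_func = user_func

    def run_batch(self, time_ms: int) -> None:
        self.user_func(self.parent.compute(time_ms), time_ms)


sink: List[Tuple[int, List[str]]] = []
source = ToyInputDStream({1000: ["a", "b"], 2000: ["c"]})
mapped = ToyMappedDStream(source, str.upper)
output = ToyForEachDStream(mapped, lambda rdd, t: sink.append((t, rdd.records)))

for batch_time in (1000, 2000):
    output.run_batch(batch_time)
print(sink)
```

Note that user code only ever touches an RDD inside the function passed to the output step; everything upstream is created and held by the stream objects themselves.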