Spark Tutorial
Author: 时海 风自在
Computing an average with aggregate

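aggregate can compute the sum and the element count in a single pass, and the average follows from those two values. The zero value (0,0) is the initial (sum, count) pair; the first function folds each element into that pair within a partition, and the second function merges the pairs produced by different partitions.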

scala> val rdd1 = sc.parallelize(Seq(1,2,3,4,5,6,7,8,9), 3)
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[6] at parallelize at <console>:24

scala> rdd1.aggregate((0,0))((x,y)=>(x._1+y,x._2+1),(x,y)=>(x._1+y._1,x._2+y._2))
res5: (Int, Int) = (45,9)
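
The result (45,9) is the pair (sum, count); the average still has to be computed from it. The following is a minimal sketch in plain Scala, without Spark, that imitates the two steps of aggregate and then does the division. The names AggregateAverageSketch, seqOp, combOp and sumAndCount are made up for illustration, and the three partitions are simulated with grouped(3), assuming the nine elements are split into three groups of three.

```scala
// Plain-Scala sketch of what aggregate((0,0))(seqOp, combOp) computes.
// No Spark is needed; partitions are simulated with grouped(partitionSize).
object AggregateAverageSketch {
  // seqOp: fold one element into a partition's running (sum, count)
  val seqOp: ((Int, Int), Int) => (Int, Int) =
    (acc, v) => (acc._1 + v, acc._2 + 1)

  // combOp: merge the (sum, count) pairs of two partitions
  val combOp: ((Int, Int), (Int, Int)) => (Int, Int) =
    (a, b) => (a._1 + b._1, a._2 + b._2)

  def sumAndCount(data: Seq[Int], partitionSize: Int): (Int, Int) = {
    val zero = (0, 0)
    data.grouped(partitionSize)      // simulated partitions
      .map(_.foldLeft(zero)(seqOp))  // per-partition (sum, count)
      .foldLeft(zero)(combOp)        // merge the partition results
  }

  def main(args: Array[String]): Unit = {
    val (sum, count) = sumAndCount(Seq(1, 2, 3, 4, 5, 6, 7, 8, 9), 3)
    // Convert to Double first; Int / Int would truncate the average.
    val avg = sum.toDouble / count
    println(s"sum=$sum count=$count avg=$avg") // prints "sum=45 count=9 avg=5.0"
  }
}
```

In the spark-shell session above, the same last step would be to bind the result of rdd1.aggregate to a pair and divide the sum (as a Double) by the count.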

