Spark Tutorial
1. Sample Data

Each line of the file stores one JSON object. The file path is example/input/data:

{ "name": "Andy", "age": 30 }
{ "name": "Justin", "age": 19 }
{ "name": "tom", "age": 21 }
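If you prefer not to prepare the file by hand, an equivalent DataFrame can be built directly in memory. A minimal sketch, assuming a spark-shell session (where the spark session and its implicits are available):

```scala
// Build the same three rows as an in-memory DataFrame.
// spark.implicits._ provides the toDF conversion on Seq.
import spark.implicits._

val df = Seq(("Andy", 30L), ("Justin", 19L), ("tom", 21L))
  .toDF("name", "age")
```

Note that the column order here follows the tuple order, whereas the JSON reader lists columns alphabetically (age, name).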
2. Loading the Data
scala> val df = spark.read.json("example/input/data")
...
df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
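spark.read.json infers the schema by scanning the data, which is why age comes back as bigint. For large inputs you can skip the inference pass by supplying the schema up front. A sketch, assuming the same file path:

```scala
import org.apache.spark.sql.types._

// Declare the schema explicitly so Spark does not need to infer it.
val schema = StructType(Seq(
  StructField("age", LongType, nullable = true),
  StructField("name", StringType, nullable = true)
))

val df = spark.read.schema(schema).json("example/input/data")
```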
3. Viewing the Data
scala> df.show
+---+------+
|age|  name|
+---+------+
| 30|  Andy|
| 19|Justin|
| 21|   tom|
+---+------+
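By default, show prints at most 20 rows and truncates long cell values. Both behaviors can be changed through show's standard overloads:

```scala
// Print only the first 2 rows.
df.show(2)

// Print all default rows without truncating long cell values.
df.show(numRows = 20, truncate = false)
```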
4. Viewing the Schema
scala> df.printSchema
root
 |-- age: long (nullable = true)
 |-- name: string (nullable = true)
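Beyond printing it, the schema is also available programmatically as a StructType, which is handy for validation in code. A sketch in the same shell session:

```scala
// df.schema returns the StructType behind printSchema's output.
val fieldNames = df.schema.fields.map(_.name)  // Array("age", "name")

// StructType can be indexed by field name to get a StructField.
val ageType = df.schema("age").dataType        // LongType
```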
5. Basic Query Operations
scala> df.select("name").show
+------+
|  name|
+------+
|  Andy|
|Justin|
|   tom|
+------+

scala> df.select($"name", $"age" + 1).show
+------+---------+
|  name|(age + 1)|
+------+---------+
|  Andy|       31|
|Justin|       20|
|   tom|       22|
+------+---------+

scala> df.filter($"age" > 21).show
+---+----+
|age|name|
+---+----+
| 30|Andy|
+---+----+
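The same queries can also be written in SQL by registering the DataFrame as a temporary view; createOrReplaceTempView and spark.sql are standard Spark SQL APIs. A sketch (the view name "people" is our choice):

```scala
// Register the DataFrame under a view name for SQL queries.
df.createOrReplaceTempView("people")

// Equivalent to df.filter($"age" > 21).show
spark.sql("SELECT age, name FROM people WHERE age > 21").show
```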