HDFSCLI(1)                      User Commands                      HDFSCLI(1)

NAME
       hdfscli-avro – an Avro extension for HdfsCLI

SYNOPSIS
       hdfscli-avro schema [-a ALIAS] [-v...] HDFS_PATH

       hdfscli-avro read [-a ALIAS] [-v...] [-F FREQ | -n NUM] [-p PARTS]
                    HDFS_PATH

       hdfscli-avro write [-fa ALIAS] [-v...] [-C CODEC] [-S SCHEMA] HDFS_PATH

       hdfscli-avro -L | -h

COMMANDS
       schema Pretty-print the schema of an Avro file.

       read   Read an Avro file from HDFS and output its records as JSON to
              standard out.

       write  Read JSON records from standard in and serialize them into a
              single Avro file on HDFS.

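       The write command consumes JSON records on standard input. As a
       minimal sketch, the Python snippet below prepares such input,
       assuming one JSON object per line (the layout suggested by a .jsonl
       file name) and an Avro record schema saved as schema.avsc. The record
       name and the fields id and label are hypothetical, chosen only for
       illustration.

```python
import json
from pathlib import Path

# Hypothetical records; the field names are illustrative only.
records = [
    {"id": 1, "label": "positive"},
    {"id": 2, "label": "positive"},
]

# An Avro record schema matching the hypothetical records above.
schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "label", "type": "string"},
    ],
}


def write_inputs(directory="."):
    """Write positives.jsonl (one JSON object per line) and schema.avsc."""
    base = Path(directory)
    jsonl_path = base / "positives.jsonl"
    schema_path = base / "schema.avsc"
    with jsonl_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    schema_path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return jsonl_path, schema_path


if __name__ == "__main__":
    write_inputs()
```

       The two files could then be handed to the command, for instance as
       hdfscli-avro write -S "$(cat schema.avsc)" HDFS_PATH <positives.jsonl.
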
ARGUMENTS
       HDFS_PATH
              Remote path to an Avro file or to a directory containing Avro
              part-files.

OPTIONS
       -C CODEC --codec=CODEC
              Compression codec. One of null, deflate, or snappy.
              [default: deflate]

       -F FREQ --freq=FREQ
              Probability of sampling each record (e.g. 0.1 keeps roughly
              one record in ten). Cannot be combined with -n.

       -L --log
              Show path to current log file and exit.

       -S SCHEMA --schema=SCHEMA
              Schema for serializing records. If not passed, it will be
              inferred from the first record.

       -a ALIAS --alias=ALIAS
              Alias of namenode to connect to.

       -f --force
              Overwrite any existing file.

       -h --help
              Show a usage message and exit.

       -n NUM --num=NUM
              Cap on the number of records to output. Cannot be combined
              with -F.

       -p PARTS --parts=PARTS
              Part-files to read. Specify a number to randomly select that
              many, or a comma-separated list of numbers to read only these.
              Use a number followed by a comma (e.g. 1,) to read a single
              specific part-file. The default is to read all part-files.

       -v --verbose
              Enable log output. Can be specified up to three times
              (increasing verbosity each time).

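       The three forms accepted by -p are easy to confuse. The Python sketch
       below restates the rules from the option description; it is an
       illustration only, not HdfsCLI's own parsing code, and the function
       name interpret_parts is hypothetical.

```python
def interpret_parts(value=None):
    """Describe how a -p PARTS value is read, per the option description.

    Returns a (mode, payload) tuple:
      ("all", None)           no value: read every part-file
      ("random", n)           bare number: pick n part-files at random
      ("explicit", [i, ...])  comma-separated list: read exactly these
    """
    if value is None or value == "":
        return ("all", None)
    if "," in value:
        # A trailing comma ("1,") produces an empty piece, which is dropped,
        # leaving a one-element list: that single part-file.
        return ("explicit", [int(piece) for piece in value.split(",") if piece])
    return ("random", int(value))


if __name__ == "__main__":
    for example in (None, "3", "2,3", "1,"):
        print(repr(example), "->", interpret_parts(example))
```

       Under this reading, -p 1 selects one part-file at random, whereas
       -p 1, reads part-file number 1 and nothing else.
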
EXAMPLES
       hdfscli-avro schema /data/impressions.avro
       hdfscli-avro read -a dev snapshot.avro >snapshot.jsonl
       hdfscli-avro read -F 0.1 -p 2,3 clicks.avro
       hdfscli-avro write -f positives.avro <positives.jsonl -S "$(cat schema.avsc)"

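       The read example with -F 0.1 samples records instead of emitting all
       of them. The Python sketch below shows what sampling each record with
       a fixed probability means, next to an -n style cap. It assumes each
       record is kept independently with probability FREQ; the helper names
       are hypothetical and this is not HdfsCLI's implementation.

```python
import itertools
import random


def sample_records(records, freq, rng=None):
    """Yield each record independently with probability freq (like -F)."""
    if rng is None:
        rng = random.Random()
    for record in records:
        # rng.random() is uniform on [0, 1), so the test passes with
        # probability freq.
        if rng.random() < freq:
            yield record


def cap_records(records, num):
    """Yield at most num records, in their original order (like -n)."""
    return itertools.islice(records, num)


if __name__ == "__main__":
    records = ({"id": i} for i in range(1000))
    kept = list(sample_records(records, 0.1, random.Random(0)))
    print(len(kept), "of 1000 records kept")
```

       Sampling makes the output size vary from run to run, while a cap
       gives an exact upper bound on the number of records.
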
SEE ALSO
       hdfscli(1)

October 2021                                                        HDFSCLI(1)