HDFSCLI(1)                       User Commands                      HDFSCLI(1)

NAME

       hdfscli-avro – an Avro extension for HdfsCLI

SYNOPSIS

       hdfscli-avro schema [-a ALIAS] [-v...] HDFS_PATH

       hdfscli-avro read [-a ALIAS] [-v...] [-F FREQ | -n NUM] [-p PARTS]
       HDFS_PATH

       hdfscli-avro write [-fa ALIAS] [-v...] [-C CODEC] [-S SCHEMA] HDFS_PATH

       hdfscli-avro -L | -h

OPTIONS

   COMMANDS
       schema Pretty-print the schema.

       read   Read an Avro file from HDFS and output records as JSON to
              standard out.

       write  Read JSON records from standard in and serialize them into a
              single Avro file on HDFS.

   ARGUMENTS
       HDFS_PATH
              Remote path to an Avro file or to a directory containing Avro
              part-files.

   OPTIONS
       -C CODEC --codec=CODEC
              Compression codec.  Available values: null, deflate, snappy.
              [default: deflate]

       -F FREQ --freq=FREQ
              Probability of sampling a record.

       -L --log
              Show path to current log file and exit.

       -S SCHEMA --schema=SCHEMA
              Schema for serializing records.  If not passed, it will be
              inferred from the first record.

       -a ALIAS --alias=ALIAS
              Alias of namenode to connect to.

       -f --force
              Overwrite any existing file.

       -h --help
              Show a usage message and exit.

       -n NUM --num=NUM
              Cap number of records to output.

       -p PARTS --parts=PARTS
              Part-files to read.  Specify a number to randomly select that
              many, or a comma-separated list of numbers to read only these.
              To read a single specific part-file, use its number followed
              by a comma (e.g. 1,).  The default is to read all part-files.

       -v --verbose
              Enable log output.  Can be specified up to three times
              (increasing verbosity each time).

EXAMPLES

       hdfscli-avro schema /data/impressions.avro
       hdfscli-avro read -a dev snapshot.avro >snapshot.jsonl
       hdfscli-avro read -F 0.1 -p 2,3 clicks.avro
       hdfscli-avro write -f positives.avro <positives.jsonl -S "$(cat schema.avsc)"
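
       The write example reads JSON records from positives.jsonl and an
       Avro schema from schema.avsc.  A minimal Python sketch for producing
       both files follows.  It assumes one JSON object per line (as the
       .jsonl extension suggests); the record fields (id, score) and the
       schema name are invented for illustration.

```python
import json

# Hypothetical records; the field names are illustrative only.
records = [
    {"id": 1, "score": 0.93},
    {"id": 2, "score": 0.71},
]

# Avro record schema describing the records above.
schema = {
    "type": "record",
    "name": "Positive",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "score", "type": "double"},
    ],
}


def to_json_lines(items):
    """Serialize records as one JSON object per line."""
    return "".join(json.dumps(item) + "\n" for item in items)


if __name__ == "__main__":
    # Write the input files used by the write example.
    with open("positives.jsonl", "w") as writer:
        writer.write(to_json_lines(records))
    with open("schema.avsc", "w") as writer:
        json.dump(schema, writer)
```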

SEE ALSO

       hdfscli(1)

                                 October 2021                       HDFSCLI(1)