For example, to set a space quota on the /user/tony directory:
$ hadoop fs -count -q /user/tony
none inf none inf 8 4 12257 /user/tony
$ hdfs dfsadmin -setSpaceQuota 10G /user/tony
$ hadoop fs -count -q /user/tony
none inf 10737418240 10737393196 8 4 12257 /user/tony
Column one (none) is the file count quota, which is not set. Column two (inf) means that an unlimited number of files may still be created in this directory. The third column (10737418240, i.e. 10G) is the space quota, and the fourth column (10737393196, just under 10G) is the space remaining. If the quota is exceeded, any attempt to put new files into this directory is denied and an error message is returned. For example:
$ hadoop fs -put ./shell-cmd.txt /user/tony/
put: The DiskSpace quota of /user/tony is exceeded: quota = 10 B = 10 B but diskspace consumed = 25044 B = 24.46 KB
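The column layout described above can also be read by a script. As a minimal sketch in POSIX shell, the helper describe_quota_line below is hypothetical (it is not part of Hadoop): it labels the first four columns of one line of hadoop fs -count -q output and, when a space quota is set, derives the space already consumed as quota minus remaining. The sample line is copied from the output above.

```shell
# Hypothetical helper, not a Hadoop command: label the quota columns of one
# line of `hadoop fs -count -q` output.
describe_quota_line() {
  # Word-split the line into positional parameters (intentionally unquoted).
  set -- $1
  name_quota=$1     # column 1: file count quota ("none" if unset)
  name_left=$2      # column 2: files that may still be created ("inf" if unset)
  space_quota=$3    # column 3: space quota in bytes ("none" if unset)
  space_left=$4     # column 4: space remaining in bytes ("inf" if unset)

  echo "file_count_quota=$name_quota files_remaining=$name_left"
  echo "space_quota=$space_quota space_remaining=$space_left"

  case "$space_quota$space_left" in
    ''|*[!0-9-]*) ;;  # "none"/"inf" or empty: no space quota, nothing to compute
    *) echo "space_consumed=$(( $space_quota - $space_left ))" ;;
  esac
}

# Sample line from the example above. Against a live cluster you could pass
# "$(hadoop fs -count -q /user/tony)" instead.
describe_quota_line "none inf 10737418240 10737393196 8 4 12257 /user/tony"
```

For the sample line the derived consumption is 25044 bytes, the same figure that appears as "diskspace consumed" in the error message above.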
HDFS quota accounting:
Because HDFS is a distributed filesystem and many clients can write to a directory at once, it would be difficult to check each byte written against the remaining quota. Instead, HDFS assumes that an entire block will be filled as soon as it is allocated, which can produce unintuitive error messages. For example, suppose /user/tony has a quota of 2M. Writing an 8KB file with a block size of 128M causes a quota violation, because HDFS charges 3 x 128M = 384M against the quota instead of 3 x 8KB = 24KB (assuming a replication factor of 3).
$ hdfs dfsadmin -setSpaceQuota 2M /user/tony
$ hadoop fs -count -q /user/tony
none inf 2097152 2072108 8 4 12257 /user/tony
$ hdfs getconf -confKey dfs.blocksize
134217728
$ hdfs getconf -confKey dfs.replication
3
$ fallocate -l 8K foo
$ hadoop fs -put ./foo /user/tony
13/12/24 16:01:51 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/tony is exceeded: quota = 2097152 B = 2 MB but diskspace consumed = 402678228 B = 384.02 MB
    at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:161)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1633)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1369)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:351)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:2662)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2326)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformatio
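The arithmetic behind that error message can be reproduced by hand. The sketch below is plain shell arithmetic, not a Hadoop tool, and the helper name charge_for_new_block is hypothetical. It assumes, as described above, that each newly allocated block is charged at the full block size times the replication factor, and it uses the numbers from the session above.

```shell
# Hypothetical helper: bytes charged against the space quota when one new
# block is allocated (full block size times replication factor, regardless
# of how much data is actually written).
charge_for_new_block() {
  # $1: block size in bytes, $2: replication factor
  echo $(( $1 * $2 ))
}

# Values taken from the example session above.
block_size=134217728                # dfs.blocksize (128 MB)
replication=3                       # dfs.replication
quota=2097152                       # 2 MB space quota
consumed=$(( 2097152 - 2072108 ))   # quota minus remaining = 25044 bytes

# Usage the quota check sees once the first block of the new file is allocated.
projected=$(( consumed + $(charge_for_new_block "$block_size" "$replication") ))
echo "projected usage: $projected bytes"   # prints: projected usage: 402678228 bytes

if [ "$projected" -gt "$quota" ]; then
  echo "quota would be exceeded"
fi
```

The projected figure, 402678228 bytes, is exactly the "diskspace consumed" value reported in the exception, even though the file itself is only 8 KB. Under this accounting, a directory needs at least block size times replication factor of free quota (384 MB here) before a new file can be written to it.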
To remove the space quota:
$ hdfs dfsadmin -clrSpaceQuota /user/tony
To set and clear file count quotas, use hdfs dfsadmin -setQuota number path and hdfs dfsadmin -clrQuota path, respectively.