Documentation

The Java™ Tutorials
Hide TOC
Set Implementations实现
Trail: Collections
Lesson: Implementations

Set Implementations实现

The Set implementations are grouped into general-purpose and special-purpose implementations.Set实现分为通用和专用实现。

General-Purpose Set Implementations通用集合实现

There are three general-purpose Set implementations — HashSet, TreeSet, and LinkedHashSet. 有三种通用Set实现#151; HashSetTreeSetLinkedHashSetWhich of these three to use is generally straightforward. 使用这三个选项中的哪一个通常很简单。HashSet is much faster than TreeSet (constant-time versus log-time for most operations) but offers no ordering guarantees. HashSetTreeSet快得多(对于大多数操作来说,是固定时间还是日志时间),但不提供排序保证。If you need to use the operations in the SortedSet interface, or if value-ordered iteration is required, use TreeSet; otherwise, use HashSet. 如果需要使用SortedSet接口中的操作,或者如果需要按值排序的迭代,请使用TreeSet;否则,使用HashSetIt's a fair bet that you'll end up using HashSet most of the time.可以肯定的是,你最终大部分时间都会使用HashSet

LinkedHashSet is in some sense intermediate between HashSet and TreeSet. 在某种意义上介于HashSetTreeSet之间。Implemented as a hash table with a linked list running through it, it provides insertion-ordered iteration (least recently inserted to most recently) and runs nearly as fast as HashSet. 它作为一个哈希表实现,并在其中运行一个链表,它提供了按插入顺序的迭代(从最近插入到最近插入),运行速度几乎与HashSet一样快。The LinkedHashSet implementation spares its clients from the unspecified, generally chaotic ordering provided by HashSet without incurring the increased cost associated with TreeSet.LinkedHashSet实现使其客户机免受HashSet提供的未指明、通常混乱的排序,而不会增加与TreeSet相关的成本。

One thing worth keeping in mind about HashSet is that iteration is linear in the sum of the number of entries and the number of buckets (the capacity). 关于HashSet,值得记住的一点是,迭代在条目数和桶数(容量)之和上是线性的。Thus, choosing an initial capacity that's too high can waste both space and time. 因此,选择过高的初始容量可能会浪费空间和时间。On the other hand, choosing an initial capacity that's too low wastes time by copying the data structure each time it's forced to increase its capacity. 另一方面,选择太低的初始容量会浪费时间,因为每次强制增加容量时都会复制数据结构。If you don't specify an initial capacity, the default is 16. 如果不指定初始容量,默认值为16。In the past, there was some advantage to choosing a prime number as the initial capacity. 在过去,选择素数作为初始容量有一些优势。This is no longer true. 这不再是事实。Internally, the capacity is always rounded up to a power of two. 在内部,容量总是四舍五入到二的幂。The initial capacity is specified by using the int constructor. 初始容量由int构造函数指定。The following line of code allocates a HashSet whose initial capacity is 64.下面的代码行分配一个初始容量为64的HashSet

Set<String> s = new HashSet<String>(64);

The HashSet class has one other tuning parameter called the load factor. HashSet类还有一个名为load factor的调优参数。If you care a lot about the space consumption of your HashSet, read the HashSet documentation for more information. 如果您非常关心HashSet的空间消耗,请阅读HashSet文档以了解更多信息。Otherwise, just accept the default; it's almost always the right thing to do.否则,就接受默认值;这几乎总是正确的做法。

If you accept the default load factor but want to specify an initial capacity, pick a number that's about twice the size to which you expect the set to grow. 如果您接受默认的负载系数,但希望指定初始容量,请选择一个大约是预期集增长到的大小的两倍的数字。If your guess is way off, you may waste a bit of space, time, or both, but it's unlikely to be a big problem.如果你的猜测有偏差,你可能会浪费一些空间、时间,或者两者都浪费,但这不太可能是个大问题。

LinkedHashSet has the same tuning parameters as HashSet, but iteration time is not affected by capacity. LinkedHashSetHashSet具有相同的调优参数,但迭代时间不受容量的影响。TreeSet has no tuning parameters.TreeSet没有调整参数。

Special-Purpose Set Implementations特殊目的集实现

There are two special-purpose Set implementations — EnumSet and CopyOnWriteArraySet.有两种特殊用途的Set实现—EnumSetCopyOnWriteArraySet

EnumSet is a high-performance Set implementation for enum types. EnumSet是针对枚举类型的高性能Set实现。All of the members of an enum set must be of the same enum type. 枚举集的所有成员必须是同一枚举类型。Internally, it is represented by a bit-vector, typically a single long. 在内部,它由一个位向量表示,通常是一个longEnum sets support iteration over ranges of enum types. 枚举集支持枚举类型范围内的迭代。For example, given the enum declaration for the days of the week, you can iterate over the weekdays. 例如,给定一周中几天的枚举声明,您可以迭代工作日。The EnumSet class provides a static factory that makes it easy.EnumSet类提供了一个静态工厂,使其变得简单。

for (Day d : EnumSet.range(Day.MONDAY, Day.FRIDAY))
        System.out.println(d);

Enum sets also provide a rich, typesafe replacement for traditional bit flags.枚举集还为传统的位标志提供了丰富的、类型安全的替代品。

EnumSet.of(Style.BOLD, Style.ITALIC)

CopyOnWriteArraySet is a Set implementation backed up by a copy-on-write array. CopyOnWriteArraySet是一个Set实现,由写时拷贝阵列备份。All mutative operations, such as add, set, and remove, are implemented by making a new copy of the array; no locking is ever required. 所有的变异操作,如addsetremove,都是通过创建数组的新副本来实现的;不需要锁。Even iteration may safely proceed concurrently with element insertion and deletion. 甚至迭代也可以安全地与元素插入和删除同时进行。Unlike most Set implementations, the add, remove, and contains methods require time proportional to the size of the set. 与大多数Set实现不同,addremovecontains方法需要与Set大小成比例的时间。This implementation is only appropriate for sets that are rarely modified but frequently iterated. 这种实现只适用于很少修改但经常迭代的集合。It is well suited to maintaining event-handler lists that must prevent duplicates.它非常适合维护必须防止重复的事件处理程序列表。


Previous page: Implementations
Next page: List Implementations