The standard Java classes contain 1620 English words


Post by БудДен » 24.06.21 09:46

I wrote earlier that the standard Java classes contain 5000 distinct English words, but that was a mistake. Correction: I found my program, and it reports 1620 words. Here is the list:

" Text Editable Accessible Component Extended Table Hyperlink Hypertext Icon Binding Key Object Relation Set Bundle Resource Role Selection State Streamable Change Model Sequence Value Mode Access Event Watchpoint Request Accordion Skin Exception Account Expired Locked Found Not Acl Entry Builder . Flag Permission Type View Attribute File Action Listener Map I U Activatable Failed Activate Flavor Data Activation Desc Group Stub Environment Command D Instantiator Monitor System Activator E V T C A Active L P M O Y R Q Completed Activity Required Adapter Operations Exists Already Helper Inactive Id Manager Name Existent Non Address Addressing Feature Responses Adjustable Adjustment 2 3 Adler Tag Bad Affine Transform Op Initialization Agent Load Alert Constraints Algorithm Method Generator Parameter Spi Parameters Spec All Composite Alpha Bound Holder Connected Light Ambient Ancestor Pane Anchor Animation Status Timer Array Annotated Construct Element Parameterized Tree Variable Wildcard Annotation Pair Error Format Mirror Doc Mismatch Visitor Any Seq Configuration App Control Module Login Appendable Foreground Hidden Applet Context Initializer Application Reopened Arc Double Float To Area Filter Scale Averaging Chart N G Arithmetic Queue Blocking Deque Bounds Of Out Index List Literal Reference Arrays Store Assertion Assert Assignment Association Notification Assoc Box Async Handler Channel Byte Asynchronous Provider Close Socket Server Boolean Atomic Integer Updater Field Long Markable Supported Move Stamped Connector Attaching Marshaller Attachment Part Unmarshaller Attach Operation Attr Iterator Character Attributed String Use In Impl Modification Attributes Color Font Paragraph Utilities Kind Exp Clip Audio Equalizer Reader Writer Encoding Stream Input Spectrum Track Authentication Authenticator Failure Requestor Result Retry Success Authorization Callback Authorize Author Auth Closeable Auto Autoscroll W Proxy Multicaster Stroke Axis Mark Tick Background Fill 
Image Position Repeat Size Backing X B S Binary Location Padding Combine Band Sample Banded Bar 4 6 Base Decoder Encoder F Baseline Resolution Multi Row Button Arrow Basic Borders Border Margin Menu Radio Rollover Split Toggle Item Check Chooser Editor Combo Renderer Popup Desktop Directory Formatted Utils Graphics H Factory Title Frame Internal Label Feel And Look Option Layout Panel Password Separator Progress Root Scroll Service Slider Spinner Divider Tabbed Header Caret Highlighter Tool Tip Viewport Update Batch Bean Child Support Container Membership Available Info Revoked Services Descriptor Property Beans Linker Bevel Consumer Bi Bidi Function Decimal Big Converter Operator Addr Ref Binder Bind Bindings Predicate Bit Blend Blob Block Bloom Blur Param Write Book Expression Supplier Bootstrap Style Compound Empty Etched Line Matte Titled Widths Meter Bounded Range Bounding Filler Boxed Break Breakpoint Barrier Broken Bubble Buffer Capabilities Contents Flip Buffered Output Overflow Pool Strategy Underflow Lookup Order 1 Cached Hint Cache Response Calendar Callable Statement Site Call Camera Job Print Cancelable Cancellation Cancelled Proceed Cannot Redo Undo Canonicalization Canvas Card Present Terminal Terminals Case Catalog Features Resolver Catch Category Section Cell Certificate Rep Valid Yet Parsing Path Cert Checker Trust Validator Reason Selector Chained Char Changed Channels Subset Unicode Script Coding Characters Conversion Charset Checkbox Checked Checksum Choice Dialog Chromaticity Chrono Date Local Time Chronology Period Unit Zoned Cipher Circle Class Cast Circularity Declaration Definition Transformer Loader Repository Loading Loaded Prepared Prepare Unload Cleaner Cleanable Client Interceptor Clipboard Content Owner Clob Clock Cloneable Clone Interrupt By Closed Connection Watch Codec Malfunction Coder Sets Code Signer Source Collapsed Collation Collator Collection Collections Collector Characteristics Collectors Adjust Convert Picker Space 
Column Comment Common Communication Comparable Comparator Compilable Compilation Compiled Compiler Future Completable Task Completion Completions Stage Behavior Resize Orientation Invocation Dynamic Guarding Based Edit Compression Hash Concurrent Linked Navigable Skip Condition Conditional Loop Config Confirmation Connect Pending Argument Selected Console Constant Constructor Properties Policy Traversal Focus Display Rendered Contextual Continue Controller Comparison Convolve Cookie Copies On Copy Radii Corner Latch Down Count Completer Counted Counter Credential Crop Primitive Crypto Rule Face Import Media Meta Css Page Parser Parse Inline Stylesheet Sheet Unknown Curve Cubic Cull Currency Current Cursor Customizer Marshal Custom Cycle Cyclic Cylinder Amount Database Int Short Datagram Packet Truncation Datatype Constants Interface Symbols Formatter At Creation Processing Syntax Week Day Debugger Debug Snippet Declared Decl Default Kit Beep Cut Typed Insert Tab Paste Painter Highlight Keyboard Theme Metal Node Mutable Delegate Persistence Sorter Wrapper Single Document Styled Undoable Validation Deflater Delayed Delay Delegation Deprecated Test Depth Derive Description Read Sede Design Destination Destroyable Destroy Detail Gen Private Public Diag Diagnostic Exclusion Modal Modality Dictionary Digest Dimension Dir Direct Execution Directive Dispatch Displacement Dn Reporter Doclet Positions Scanner Trees Documentation Documented Bypass Fragment Combiner Domain Dom Implementation Registry Locator Sign Structure Validate Dos Accumulator Adder Statistics Summary Unary While Do Download Dragboard Drag Gesture Recognizer Drop Motion Draw Drbg Capability Instantiation Bytes Next Reseed Driver Shadow Target Scroller Params Flags Duplicate Duration Dyn Enum Fixed Struct Union 2m Fp Point Effect Inner 7 8 9 Elements Origin Ellipse Elliptic Stack Enabled Encoded Encrypted Encryption End Endpoint For Enhanced Entity Enumeration Era Erroneous Eval Chain Dispatcher Settings 
Exc Message Exchanger Executable Member Bytecodes Install Termination Engine Implemented Run Stopped User Env Executor Executors Mechanism Exemption Exif Interoperability Parent Veto Expand Experimental Export Exports Options Session Extension Installer Externalizable Transition Fade Over Fail Fault Fax Fidelity Lock Interruption Filename Open Filer Files Save Systems Detector Visit Filtered Find Finishings Height Flattening Recorder Flight Flow Processor Publisher Subscriber Subscription Flushable Cause Weight Metrics Posture Render Smoothing Join Fork Thread Worker Blocker Managed Formattable Form Submit Forwarding Java Forward Frequency Functional Gap Garbage Gathering Gauge Gaussian Gc General Security Generated Generic Signature Geo Glow Justification Glyph Vector Goto Paint Gradient Graphic Template Device Translucency Window Gray Gregorian Bag Grid Alignment Principal Util Guard Guarded Exporter Guards Initialized Z Handshake Controls Has Hashtable Headers Headless Hex Hierarchy Hijrah Hit Direction Horizontal Verifier Hostname Host Spot Hot Pos Body Div Link Head Heading Html Is Legend Mod Opt Pre Quote Select Caption Col Http Redirect Version Exchange Configurator Https Timeout Profile Identifier Identity Scope Uniqueness If Invalid Metadata Warning Illegal Caller Arguments Precision Width Receive Unbind Locale Illformed Observer Pattern Producer Transcoder Specifier Imaging Immutable Implicit Inaccessible Incompatible Incomplete Inconsistent Indexed Indirection Inet Inflater Inheritable Inherit Inherited Initial Initializable Ldap Init Requests Sec Inquire Insets Instance Instant Instrument Instrumentation Resources Insufficient Integration International Interpolatable Interpolator Interrupted Naming Interruptible Intersection Introspection Introspector J Invalidation Number Midi Preferences Search Slot Transaction Invocable Invoke Invoker 0 Iso Fields Istring Selectable Iterable Iv Japanese Jar Javac Shell Global Plugin Jdbc Jdi Initiator Abstract Layer 
Layered Runtime Addressable Jndi Random Handling Multiple Sides Until Hold Impressions Octets K Processed Sheets From Originating Priority Reasons Joinable Huffman Julian Cred Kerberos Ticket Kernel Agreement Combination Modifier Post Management Keymap Purpose Protection Secret Trusted 5 Krb Labeled Ladder Lambda Metafactory Language Last Launching Placement Referral Lease Level Lexical Lifespan Distant Lighting Exceeded Limit Linear Measurer Sorting Unavailable Linkage Transfer Listening Filtering Country Locatable Locate Logger Logging Logical Log Record Serializer Mac Mailcap Main Malformed Manage Manifest Mapped Marshalled Mask Match Matcher Material Math Matrix Registration Forwarder Marker Player Printable Engineering Other Tracker Tray Memory Usage Mouse Shortcut Mesh Prop Messager Flush Palette Folder Leaf Exit Handle Proxies Handles Receiver Transmitter Mime Mimetypes Minguo Minimal Mirrored Types Missing Mixer Let Mnemonic Broadcaster Observable Modifiable Modifiers Opens Provides Requires Uses Finder Entered Contended Enter Setting Waited Wait Month Wheel Multicast Packed Pixel Master Mutation Arg Named Namespace Ext Nashorn Native Navigation Negative Nested Nesting Net Network New Nimbus Def No Noninvertible Invertible Readable Writable Normalized Normalizer Route Servant Such Notation Compliant Identifiable Emitter Notifying Serializable Numeric Sid Primary Null Pointer Documents Jobs Intervening Up Shaper Obj Collected Get Put Objects Ocean Octet Offset Oid Oneway Operating Optional Requested Assigned Keys Overlapping Overlay Override Overrun Pack Packer Unpacker Package Pageable Quality Results Paged Ranges Minute Per Pages Pagination Radial Repeating Paper Parallel Parameterizable Parenthesized Parsed Delegator Partial Patch Paths Machine Virtual Searching Pause Peer Percentage Permissions Persistent Perspective Phantom Phaser Phong Pick Pie Pipe Sink Piped Grabber Interleaved Revocation Plain Platform Qualifier Polygon Polyline Pooled Port Remote 
Portable Unreachable Bias Posix Preference Preloader Presentation Printer Abort Accepting Make More Manufacturer Privileged Process Instruction Program Indicator Prompt Protocol Family Pseudo Specified Pushback Quad Nameable Qualified Query Queued Quit Raster Rdn Only Realm Recorded Trace Recording Rect Rectangle Shape Rectangular Recursive Reentrant Referenceable Schemes Uri Reflection Reflective Reflect Refreshable Refresh Reg Region Registerable Registered Rejected Relational Relinkable Remarshal Renderable Hints Rendering Repaint Repeatable Replicate Requesting Rescale Resolved Resolve Accuracy Approver Denied Respect Retention Retrieval Return Reverb Max Robot Unresolved Rotate Round Rounding Lifetime Mapper Sort Crt Prime Runnable Scheduled Varargs Safe Sasl Savepoint Recognized Scatter Scattering Scene Antialiasing Schema Violation Screen Sleep Scrollable Scrollbar Units Vertical Sctp Standard Streams Sealed Secondary Secure Seekable See Segment Semaphore Send Tone Sepia Sequencer Sync Sequential Serial Datalink Serialized Information Severity Sharding Shear Collate Shutdown Side Signed Simple Styleable Zone Since Requirements Skeleton Skinnable Snapshot Sub Envelope Handlers Sockets Soft Solaris Sorted Soundbank Analysis Completeness Names Suggestion Sphere Splash Spliterator Spliterators Splittable Spread Spring Constraint Integrity Transient Recoverable Rollback Unverified Ssl Stacked Walker Charsets Kinds Start Tls Static St Step Stop Corrupted Streaming Tokenizer Strict Concat Joiner Cap Bold Italic Underline Subject Submission Subtitle Values Warnings Suppress Swing Swipe Switch Synchronized Synchronous Synth Synthesizer Sysex Pressure Tabable Expander Closing Tabular Tagged Taglet Targeted Taskbar Templates Temporal Accessor Adjuster Adjusters Queries Texture Buddhist Thai Uncaught Death Runs Oldest Discard Threshold Throttled Throwable Throws Throw Tie Tile Timeline Timespan Timestamp Toolkit Tooltip Listeners Many Too Top Touch Transactional 
Rolledback Transferable Transformation Translate Translator Transparency Transport Listen Expansion Will Triangle Try Defaults Lazy Unchecked Undeclared Unexpected Unicast Unix Unmappable Unmarshal Unmodifiable Unrecoverable Unreferenced Unsatisfied Unsigned Unsolicited Unsupported Dereferencer Defined Jvm Var Verify Vertex Vetoable Video Visibility Modified Be Disconnected Disconnect Voice Void Volatile Watchable Weak Web History Refs When With Wrapped Aborted Wrong Issuer Xid Xml Allocator Adapters Mixed Ns Also Evaluation Evaluator Nodes Series Year Yield Zip Rules Zoom"

Code:

;;; -*- Mode:Lisp; system :ЯР.НАНО-ПАРСЕР; coding: utf-8; -*-

;; extract every word that starts with a capital letter from https://docs.oracle.com/javase/9/docs/api/index.html?overview-summary.html and count them.

(named-readtables:in-readtable :buddens-readtable-a)
(in-package :ЯР.НАНО-ПАРСЕР)

(defun Читать-слова-разбитые-по-большим-буквам (Стр)
  (let ((сп-Контекст-разбора
         (MAKE-Контекст-разбора
               :Источник (ПОЛНОСТЬЮ-КЕШИРОВАННЫЙ-ПОТОК:Создать-Полностью-кешированный-поток-из-потока (make-string-input-stream Стр))
               :Функция-получения-класса 'БУКВЫ-И-МЕСТА-В-ФАЙЛЕ:Пб-Бу
               :Значение-КФ :eof
               )))
    (Читать-слова-с-большой-буквы-выкидывая-подчёркивания)))

(defun Читать-слова-с-большой-буквы-выкидывая-подчёркивания ()
  (perga
   (let Рез nil)
     (loop
       (cond ((eql (л-Класс) #\_) (Чит-э))
             ((eql (л-Класс) :eof)
              (return Рез))
             ((perga
               (let Б (БУКВЫ-И-МЕСТА-В-ФАЙЛЕ-ЛИЦО:Пб-Бу (Предвидеть-э)))
               (eql Б (russian-budden-tools:char-upcase-cyr Б)))
              (let ((Ч (Читать-слово-с-большой-буквы-а-буква-вот (БУКВЫ-И-МЕСТА-В-ФАЙЛЕ-ЛИЦО:Пб-Бу (Чит-э)))))

                (when Ч (push Ч Рез))))
             (t ; if anything is off, just skip the character
              (Чит-э)
               )))))

(defun Читать-слово-с-большой-буквы-а-буква-вот (Б)
 (perga
   (let Рез (list Б))
   (loop (cond
          ((eql (л-Класс) #\_) (Чит-э))
          ((eql (л-Класс) :eof)
           (return (map 'string 'identity (reverse Рез))))
          ((perga
            (let Б (БУКВЫ-И-МЕСТА-В-ФАЙЛЕ-ЛИЦО:Пб-Бу (Предвидеть-э)))
            (eql Б (russian-budden-tools:char-upcase-cyr Б)))
           (return (map 'string 'identity (reverse Рез))))
          (t (push (БУКВЫ-И-МЕСТА-В-ФАЙЛЕ-ЛИЦО:Пб-Бу (Чит-э)) Рез))))))


(defparameter *t1* """
AccessibleEditableText
AccessibleExtendedComponent
AccessibleExtendedTable
... more than 5000 lines of class names omitted here
_PolicyStub
_Remote_Stub
_ServantActivatorStub
_ServantLocatorStub
""")

(defparameter *t2* (split-sequence:split-sequence #\newline *t1*))

(defparameter *t3* (mapcar 'Читать-слова-разбитые-по-большим-буквам *t2*))

(defparameter *h1* (make-hash-table :test 'equal))

(dolist (Большое-слово *t3*)
  (dolist (Слово Большое-слово)
    (setf (gethash Слово *h1*) t)))

(print (hash-table-count *h1*))
Last edited by БудДен on 24.06.21 09:49; edited 1 time in total.
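
For readers who do not have the Lisp environment used above, here is a rough Python sketch of what the program appears to do: skip underscores, start a new word at every character that equals its own uppercase form, and collect the distinct words. The names `split_by_capitals` and `count_distinct_words` are hypothetical, `str.upper()` only stands in for `char-upcase-cyr`, and `Adler32` is a JDK class name I added to the sample myself to show how digits are handled. This is my reading of the code, not the author's program.

```python
def split_by_capitals(name):
    """Split a class name into words, one per 'capital' character.

    As in the Lisp code, a character counts as a capital when it equals
    its own uppercase form, so digits and punctuation also start a word.
    """
    words = []
    current = None
    for ch in name:
        if ch == "_":
            continue  # underscores are dropped and do not end a word
        if ch == ch.upper():
            if current is not None:
                words.append(current)
            current = ch
        elif current is not None:
            current += ch
        # a lowercase character before any capital is skipped
    if current is not None:
        words.append(current)
    return words


def count_distinct_words(class_names):
    """Count the distinct words across all class names."""
    distinct = set()
    for name in class_names:
        distinct.update(split_by_capitals(name))
    return len(distinct)


# A small sample; the real input has more than 5000 class names.
sample = [
    "AccessibleEditableText",
    "AccessibleExtendedComponent",
    "AccessibleExtendedTable",
    "_PolicyStub",
    "_Remote_Stub",
    "Adler32",  # my addition, not from the excerpt above
]

print(split_by_capitals("Adler32"))
print(count_distinct_words(sample))
```

Under this rule `Adler32` yields `Adler`, `3` and `2`, which would be consistent with the single digits and letters that appear in the word list.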


Post by БудДен » 24.06.21 09:48

And here is another count, by Cyberax, from somewhere in this thread: https://www.rsdn.org/forum/flame.comp/7984849.all (he is the one who corrected me):

About 1257, once duplicates and abbreviations are removed:

Code:

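import re

# "words" is presumably a text file with one Java class name per line.
with open("words") as fl:
    lns = fl.readlines()
words = set()

for l in lns:
    # Split a CamelCase name: a capital followed by lowercase letters,
    # or a run of capitals (an abbreviation) that ends before the next word.
    for w in re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', l):
        # Drop one- and two-letter fragments and all-caps abbreviations.
        if len(w) > 2 and not w[1].isupper():
            words.add(w)

lst = sorted(words)
print(lst)
print(len(lst))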
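
To see where the two counts can diverge, here is a small sketch that runs both splitting rules on a few class names and prints the tokens that only the capital-letter rule produces. The function names are hypothetical, `words_by_capitals` is my simplified reading of the Lisp program, and the sample names are real JDK classes that I picked myself. It may explain part of the gap between 1620 and 1257, but it does not reproduce either number.

```python
import re

# The regex from Cyberax's script, compiled once.
CAMEL = re.compile(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))')


def words_by_regex(name):
    """Cyberax's rule: drop short fragments and all-caps abbreviations."""
    return {w for w in CAMEL.findall(name)
            if len(w) > 2 and not w[1].isupper()}


def words_by_capitals(name):
    """Simplified capital-letter rule: any character equal to its own
    uppercase form (letters, digits, punctuation) starts a new token."""
    tokens, current = [], ""
    for ch in name.replace("_", ""):
        if ch == ch.upper():
            if current:
                tokens.append(current)
            current = ch
        elif current:
            current += ch
    if current:
        tokens.append(current)
    return set(tokens)


# Hypothetical sample, chosen to hit abbreviations, digits and short words.
names = ["URLClassLoader", "Adler32", "ToIntFunction", "_Remote_Stub"]

by_capitals = set().union(*(words_by_capitals(n) for n in names))
by_regex = set().union(*(words_by_regex(n) for n in names))

# Tokens counted by the capital-letter rule but not by the regex rule.
print(sorted(by_capitals - by_regex))
```

On this sample the extra tokens are the letters of `URL`, the digits of `Adler32`, and the two-letter word `To`. So the regex rule removes noise, but it also drops some real short words such as `To` or `Of` that do appear in the list above.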
