A: UTF-16 uses a single 16-bit code unit to encode the most common 63K characters, and a pair of 16-bit code units, called surrogates, to encode the 1M less commonly used characters in Unicode.
Originally, Unicode was designed as a pure 16-bit encoding, aimed at representing all modern scripts. (Ancient scripts were to be represented with private-use characters.) Over time, and especially after the addition of over 14,500 composite characters for compatibility with legacy sets, it became clear that 16 bits were not sufficient for the user community. Out of this arose UTF-16. [AF]
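As a rough illustration of the single-unit versus surrogate-pair split described above, here is a minimal Python sketch. The function name to_utf16_code_units is made up for this example; the arithmetic follows the standard UTF-16 rule (subtract 0x10000, then split the remaining 20 bits into two 10-bit halves).

```python
def to_utf16_code_units(code_point: int) -> list[int]:
    """Return the UTF-16 code units (as integers) for one code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x10000:
        # Fits in a single 16-bit code unit.
        return [code_point]
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low surrogate
    return [high, low]


single = to_utf16_code_units(0x20AC)    # EURO SIGN, one code unit
pair = to_utf16_code_units(0x1F600)     # GRINNING FACE, a surrogate pair
print([hex(u) for u in single], [hex(u) for u in pair])
```

The result can be cross-checked against Python's own codec, for example "\U0001F600".encode("utf-16-be").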
Q: What is the definition of UTF-8?
A: UTF-8 is the byte-oriented encoding form of Unicode. For details of its definition, see Section 2.5 “Encoding Forms” and Section 3.9 “Unicode Encoding Forms” in the Unicode Standard. See, in particular, Table 3-5 UTF-8 Bit Distribution and Table 3-6 Well-formed UTF-8 Byte Sequences, which give succinct summaries of the encoding form. Also see sample code which implements conversions between UTF-8 and other encoding forms. Make sure you refer to the latest version of the Unicode Standard, as the Unicode Technical Committee has tightened the definition of UTF-8 over time to more strictly enforce unique sequences and to prohibit encoding of certain invalid characters. UTF-8 is also described in Internet RFC 3629 and defined in Annex D of ISO/IEC 10646. [MD]
Q: Can Unicode text be represented in more than one way?
A: Yes, there are several possible representations of Unicode data, including UTF-8, UTF-16 and UTF-32. In addition, there are compression transformations such as the one described in the Unicode Technical Report #6: A Standard Compression Scheme for Unicode. [MD]
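To see the several representations side by side, the following small Python sketch encodes one sample string three ways using the standard codecs. The sample string is arbitrary, and the big-endian variants are chosen only so that no byte order mark is added.

```python
# One ASCII letter, one BMP character, one supplementary character.
text = "A\u20ac\U0001F600"

encodings = {
    name: text.encode(name)
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

for name, data in encodings.items():
    # Show the byte count and the bytes in hexadecimal.
    print(f"{name:10} {len(data):2} bytes  {data.hex(' ')}")
```

The same three code points take a different number of bytes in each form, but each byte sequence decodes back to the same text.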
Q: What is a UTF?
A: A Unicode transformation format (UTF) is an algorithmic mapping from every Unicode code point (except surrogate code points) to a unique byte sequence. The ISO/IEC 10646 standard uses the term “UCS transformation format” for UTF; the two terms are merely synonyms for the same concept.
Each UTF is reversible, thus every UTF supports lossless round tripping: mapping from any Unicode coded character sequence S to a sequence of bytes and back will produce S again. To ensure round tripping, a UTF mapping must also map all code points that are not valid Unicode characters to unique byte sequences. These invalid code points are the 66 noncharacters (including FFFE and FFFF), as well as unpaired surrogates.
The SCSU compression method, even though it is reversible, is not a UTF because the same string can map to very many different byte sequences, depending on the particular SCSU compressor.
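The round-tripping property can be checked informally in Python. This is only a sketch: round_trips is a made-up helper name, and the sample deliberately includes the noncharacter U+FFFE. Note that Python's strict codecs reject unpaired surrogates by default, so they are left out here.

```python
def round_trips(text: str, encoding: str) -> bool:
    """True if encoding and then decoding gives back the same string."""
    return text.encode(encoding).decode(encoding) == text


# Accented letter, supplementary character, and the noncharacter U+FFFE.
sample = "caf\u00e9 \U0001F600 \ufffe"

results = {
    name: round_trips(sample, name)
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}
print(results)
```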
What is the difference between SSH and SSL?
________________________________________
SSH (Secure Shell) and SSL (Secure Sockets Layer) can both be used to secure communications across the Internet. This page tries to explain the differences between the two in easily understood terms.
SSL was designed to secure web sessions; it can do more, but that's the original intent.
SSH was designed to replace telnet and FTP; it can do more, but that's the original intent.
SSL is a drop-in with a number of uses. It front-ends HTTP to give you HTTPS. It can also do this for POP3, SMTP, IMAP, and just about any other well-behaved TCP application. It's real easy for most programmers who are creating network applications from scratch to just grab an SSL implementation and bundle it with their app to provide encryption when communicating across the network via TCP. Check out: stunnel.org.
SSH is a Swiss Army knife designed to do a lot of different things, most of which revolve around setting up a secure tunnel between hosts. Some implementations of SSH rely on SSL libraries - this is because SSH and SSL use many of the same encryption algorithms (e.g. TripleDES).
SSH is not based on SSL in the sense that HTTPS is based on SSL. SSH does much more than SSL, and they don't talk to each other - the two are different protocols, but have some overlap in how they accomplish similar goals.
SSL by itself gives you nothing - just a handshake and encryption. You need an application to drive SSL to get real work done.
SSH by itself does a whole lot of useful stuff that allows users to perform real work. Two aspects of SSH are the console login (telnet replacement) and secure file transfers (ftp replacement), but you also get an ability to tunnel (secure) additional applications, enabling a user to run HTTP, FTP, POP3, and just about anything else THROUGH an SSH tunnel.
Without interesting traffic from an application, SSL does nothing. Without interesting traffic from an application, SSH brings up an encrypted tunnel between two hosts which allows you to get real work done through an interactive login shell, file transfers, etc.
Last comment: HTTPS does not extend SSL, it uses SSL to do HTTP securely. SSH does much more than SSL, and you can tunnel HTTPS through it! Just because both SSL and SSH can do TripleDES doesn't mean one is based on the other.
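To make the "an application has to drive SSL" point concrete, here is a minimal Python sketch using the standard ssl module (which today negotiates TLS, the successor to SSL). The helper names make_client_context and open_tls_connection are invented for this example, and nothing is connected until the second function is called with a real host.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side settings: verify the server certificate and host name."""
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open an ordinary TCP connection, then wrap it in TLS.

    The application still has to speak its own protocol (HTTP, POP3,
    SMTP, ...) over the returned socket; TLS only adds the handshake
    and the encryption.
    """
    raw = socket.create_connection((host, port), timeout=10)
    return make_client_context().wrap_socket(raw, server_hostname=host)
```

A caller would then write something like an HTTP request to the wrapped socket, which is roughly what HTTPS does.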
As a parent you may have concerns about the content your children encounter as they surf the Web. Internet Explorer 6 helps you safeguard your family's browsing experience with Content Advisor, which can be used to control the Web sites that your family can view. With Content Advisor, you can give your children access to a specific list of Web sites that you allow and prevent them from accessing others. Find out how to use it so you can rest easier.
Test suite
From Wikipedia, the free encyclopedia
The most common term for a collection of test cases is a test suite. The test suite often also contains more detailed instructions or goals for each collection of test cases. It definitely contains a section where the tester identifies the system configuration used during testing. A group of test cases may also contain prerequisite states or steps, and descriptions of the following tests.
Collections of test cases are sometimes incorrectly termed a test plan. They may also be called a test script, or even a test scenario.
An executable test suite is a test suite that is ready to be executed. This usually means that there exists a test harness that is integrated with the suite and such that the test suite and the test harness together can work on a sufficiently detailed level to correctly communicate with the system under test (SUT).
The counterpart of an executable test suite is an abstract test suite. However, the terms test suite and test plan are often used with roughly the same meaning as executable and abstract test suite, respectively.
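As one possible illustration of an executable test suite, here is a small sketch using Python's standard unittest module, where the test runner plays the part of the harness. The class AdditionTests and the helpers build_suite and run_suite are made up for the example, and plain addition stands in for the system under test.

```python
import io
import unittest


class AdditionTests(unittest.TestCase):
    """Two test cases against a trivial stand-in for the system under test."""

    def test_small_numbers(self):
        self.assertEqual(1 + 1, 2)

    def test_negative_numbers(self):
        self.assertEqual(-1 + 1, 0)


def build_suite() -> unittest.TestSuite:
    """Collect the test cases into a suite."""
    return unittest.defaultTestLoader.loadTestsFromTestCase(AdditionTests)


def run_suite() -> unittest.TestResult:
    """Execute the suite; the runner acts as the harness."""
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(build_suite())
```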
See also
test harness
In software testing, a test harness is a collection of software tools and test data configured to test a program unit by running it under varying conditions and monitoring its behavior and outputs. It has two main parts: the test execution engine and the test script repository.
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)
localization
Customizing software and documentation for a particular country. It includes the translation of menus and messages into the native spoken language as well as changes in the user interface to accommodate different alphabets and culture. See internationalization and l10n
globalization
Operating around the world. Although many large companies have globalized for decades, the Web, more than any other phenomenon, has enabled the smallest company to have a global presence. See localization.
internationalization
The support for monetary values, time and date for countries around the world. It also embraces the use of native characters and symbols in the different alphabets. See localization, i18n, Unicode and IDN
Briefcase
In Windows 95/98, a system folder used for synchronizing files between two computers, typically a desktop and laptop computer. Files to be worked on are placed into a Briefcase, which is then transferred to the second machine via floppy, cable or network. The Briefcase is then brought back to the original machine after its contents have been edited on the second machine, and a special update function replaces the original files with the new ones.
DNS
(Domain Name System) A system for converting host names and domain names into IP addresses on the Internet or on local networks that use the TCP/IP protocol. For example, when a Web site address is given to the DNS either by typing a URL in a browser or behind the scenes from one application to another, DNS servers return the IP address of the server associated with that name.
In this hypothetical example, WWW.COMPANY.COM would be converted into the IP address 204.0.8.51. Without DNS, you would have to type the four numbers and dots into your browser to retrieve the Web site, which of course, you can do. Try finding the IP of a favorite Web site and type in the dotted number instead of the domain name!
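From a program, the same name-to-address conversion can be requested through the operating system's resolver. The sketch below uses socket.gethostbyname from the Python standard library; the function name resolve is just a label for this example, and the answer for any real host name depends on the network it runs on.

```python
import socket


def resolve(host_name: str) -> str:
    """Ask the system resolver for an IPv4 address for host_name."""
    return socket.gethostbyname(host_name)


# A dotted address is returned unchanged, with no DNS query involved.
print(resolve("127.0.0.1"))
```

Calling resolve with a host name instead returns whatever IPv4 address the resolver finds for it.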
Web crawler
From Wikipedia, the free encyclopedia
See WebCrawler for the specific search engine of that name.
A web crawler (also known as a web spider or web robot) is a program which browses the World Wide Web in a methodical, automated manner. Other less frequently used names for web crawlers are ants, automatic indexers, bots, and worms (Kobayashi and Takeda, 2000).
Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which will index the downloaded pages to provide fast searches. Crawlers can also be used for automating maintenance tasks on a web site, such as checking links or validating HTML code. Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).
A web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
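The seeds-and-frontier loop can be sketched in a few lines of Python. Everything here is hypothetical: crawl and FAKE_WEB are names made up for the example, the "web" is an in-memory dictionary rather than real pages, and the only policy is breadth-first order with a page limit.

```python
from collections import deque


def crawl(seeds, get_links, max_pages=100):
    """Visit pages breadth-first, starting from the seed URLs."""
    frontier = deque(dict.fromkeys(seeds))   # the crawl frontier
    seen = set(frontier)                     # URLs already queued
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:             # queue each URL only once
                seen.add(link)
                frontier.append(link)
    return visited


# Stand-in for the web: page name -> hyperlinks found on that page.
FAKE_WEB = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}

order = crawl(["a"], lambda url: FAKE_WEB.get(url, []))
print(order)
```

A real crawler would replace get_links with code that downloads the page and extracts its hyperlinks, and would add politeness and revisit policies.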
Spidering
From Wikipedia, the free encyclopedia
"Spidering" is the process of using an automated script or bot to go to one or many websites and pull information to be stored for later use. The script can be targeted towards any set of information desired by the author.
Many legitimate sites use spidering as a means of providing up-to-date data. Froogle is a good example -- type a product into the Froogle search and Froogle will spider sites to retrieve the most current prices available.
As spiders can search data much more quickly and in greater depth than human searchers, they can have a crippling impact on the performance of a site. If a single spider is performing multiple searches per second and demanding full result sets, a server would have a hard time keeping up with requests from multiple spiders.
"Spidering" is a synonym for "web crawling" (see "Web Crawler").
Focus on Linux grep Command
Definition: Unix command "grep" allows you to search for a pattern in a list of files. Such patterns are specified as "regular expressions", which in their simplest form are "strings", such as words or sentence fragments.
The way we search for a string with grep is to put the words we are searching for together in single quotes.
• The syntax: % grep pattern file-name-1 file-name-2 …file-name-n
• An example: % grep 'mountain bike' sports hobbies
As a result of entering this command, grep will print all the lines in the file "sports" and the file "hobbies" that contain the string "mountain bike". By default the lines are printed on the computer screen (in the shell window where the command was issued).
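For readers who want to see what the command is doing, here is a rough Python imitation of that search. It is only a sketch: the function name grep and the sample contents of "sports" and "hobbies" are invented for the example, and the texts are held in memory instead of being read from files.

```python
import re


def grep(pattern: str, named_texts: dict[str, str]) -> list[str]:
    """Return 'name:line' for every line that matches the pattern."""
    matches = []
    for name, text in named_texts.items():
        for line in text.splitlines():
            if re.search(pattern, line):
                matches.append(f"{name}:{line}")
    return matches


# Made-up contents standing in for the files "sports" and "hobbies".
files = {
    "sports": "I ride a mountain bike.\nI also swim.",
    "hobbies": "Chess.\nFixing my mountain bike.",
}

for hit in grep("mountain bike", files):
    print(hit)
```

When more than one file is searched, grep prefixes each matching line with the file name, which is what the name:line format imitates.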
How a Web Server Works
You can see from this description that a Web server can be a pretty simple piece of software. It takes the file name sent in with the GET command, retrieves that file and sends it down the wire to the browser. Even if you take into account all of the code to handle the ports and port connections, you could easily create a C program that implements a simple Web server in fewer than 500 lines of code. Obviously, a full-blown enterprise-level Web server is more involved, but the basics are very simple.
Most servers add some level of security to the serving process. For example, if you have ever gone to a Web page and had the browser pop up a dialog box asking for your name and password, you have encountered a password-protected page. The server lets the owner of the page maintain a list of names and passwords for those people who are allowed to access the page; the server lets only those people who know the proper password see the page. More advanced servers add further security to allow an encrypted connection between server and browser, so that sensitive information like credit card numbers can be sent on the Internet.
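The core "take the file name from the GET command and send the file back" step can be sketched in Python without any networking at all. This is an illustration only: handle_request is a made-up name, the files live in a dictionary instead of on disk, and the port and connection handling mentioned above is left out.

```python
def handle_request(request: bytes, files: dict[str, bytes]) -> bytes:
    """Build an HTTP/1.0 response for one raw request."""
    # The request line looks like: GET /index.html HTTP/1.0
    request_line = request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "GET":
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    body = files.get(parts[1])
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    header = f"HTTP/1.0 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
    return header.encode("ascii") + body


# Hypothetical site content.
site = {"/index.html": b"<h1>hi</h1>"}
response = handle_request(b"GET /index.html HTTP/1.0\r\n\r\n", site)
print(response.decode("ascii"))
```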
How Search Engines Work
By Danny Sullivan, Editor-In-Chief
October 14, 2002
The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.
Crawler-Based Search Engines
Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.
If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.
Human-Powered Directories
A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.
Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
"Hybrid Search Engines" Or Mixed Results
In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listings over another. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it does also present crawler-based results (as provided by Inktomi), especially for more obscure queries.
Win Registry
Starting with Windows 95, the Registry is a database that holds configuration data about the hardware and environment of the PC. It is made up of the SYSTEM.DAT and USER.DAT files.
The Registry can be edited directly, but that is usually only done for very technical enhancements or as a last resort. Routine access is done via the Windows control panels through the Properties option. Right clicking on almost every icon in Windows brings you the option of selecting Properties. See Win Properties.
Registry Details
To get into the Registry itself, run the Registry Editor program (REGEDIT.EXE) from the Run command in the Start menu. The Registry contains five folders. In Windows 95/98, there is a sixth folder.
Database normalization
In relational databases, normalization is a process that eliminates redundancy, organizes data efficiently, reduces the potential for anomalies during data operations, and improves data consistency. The formal classifications for quantifying "how normalized" a relational database is are called normal forms (abbrev. NF).
A non-normalized database is vulnerable to data anomalies because it stores data redundantly. If data is stored in two locations, but later is updated in only one of the locations, then the data is inconsistent; this is referred to as an "update anomaly". A normalized database stores non-primary key data in only one location.
Normalized databases have a design that reflects the true dependencies between tracked quantities, allowing quick updates to data with little risk of introducing inconsistencies. Instead of attempting to lump all information into one table, data is spread out logically into many tables.
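A small example may help show why storing a fact in only one place avoids the update anomaly. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the customers and orders tables and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The customer's city is stored once, not repeated on every order.
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT,
        city TEXT
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item        TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada', 'London');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor');
""")

# One UPDATE in one place; every order sees the new city.
conn.execute("UPDATE customers SET city = 'Paris' WHERE id = 1")

rows = conn.execute("""
    SELECT o.item, c.city
    FROM orders AS o JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)
```

Had the city been copied into each order row, the update would have needed to touch every copy, and missing one would leave the data inconsistent.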
near pointer
In an x86 segmented address, a memory address within a single segment (the offset). Contrast with far pointer.
far pointer
In an Intel x86 segmented address, a memory address that includes both segment and offset. Contrast with near pointer.
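How a far pointer's segment and offset combine into a physical address depends on the processor mode. As a sketch, assuming 8086-style real mode with a 20-bit address bus (an assumption, since the definitions above do not name a mode), the segment is shifted left by four bits and the offset is added. The function name linear_address is made up for the example.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & 0xFFFFF


# A far pointer carries both parts; a near pointer is the offset alone.
print(hex(linear_address(0xB800, 0x0010)))
```

One consequence of this scheme is that different segment:offset pairs can refer to the same memory location, for example 1234:0005 and 1000:2345.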
DB2
(DATABASE 2) A relational DBMS from IBM that was originally developed for its mainframes. It is a full-featured SQL language DBMS that has become IBM's major database product. DB2 is known for its industrial-strength reliability, and IBM has made it available for all of its own platforms, including OS/2, OS/400, AIX (RS/6000) and OS/390, as well as for Solaris on Sun systems and HP-UX on HP 9000 workstations and servers. See DB2 UDB.
Microsoft SQL Server
A relational DBMS from Microsoft that runs on Windows NT servers. It is Microsoft's high-end client/server database and a key component in its BackOffice suite of server products. SQL Server was originally developed by Sybase and also sold by Microsoft for OS/2 and NT. In 1992, Microsoft began development of its own version. Today, Microsoft SQL Server and Sybase SQL Server are independent products with some compatibility.
IPv4 is version 4 of the Internet Protocol (IP) and it is the first version of the Internet Protocol to be widely deployed. IPv4 is the dominant network layer protocol on the Internet and, apart from its successor IPv6, the only network layer protocol in use on the Internet.
It is described in IETF RFC 791 (September 1981) which obsoleted RFC 760 (January 1980). The United States Department of Defense also standardized it as MIL-STD-1777.
IPv4 is a data-oriented protocol to be used on a packet switched internetwork (e.g., Ethernet). It is a best effort protocol in that it doesn't guarantee delivery. It doesn't make any guarantees on the correctness of the data; it may result in duplicated packets and/or packets out-of-order. These things are left to an upper layer protocol: TCP addresses all of them, while UDP adds only a checksum.
The entire purpose of IP is to provide unique global computer addressing to ensure that two computers over the Internet can uniquely identify one another.
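An IPv4 address is a 32-bit number that is usually written as four decimal bytes separated by dots. The short sketch below uses Python's standard ipaddress module to show both views; the address is the hypothetical one from the DNS entry earlier in these notes.

```python
import ipaddress

addr = ipaddress.IPv4Address("204.0.8.51")

as_int = int(addr)        # the same address as one 32-bit integer
as_bytes = addr.packed    # the four bytes as sent in a packet header

print(addr, as_int, as_bytes.hex(" "))
```

Converting the integer back with ipaddress.IPv4Address(as_int) gives the same dotted form.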