Why is URL Filtering required and how is it accomplished?
What is URL Filtering?
Suppose you type the name of your favorite social networking site into your web browser at the office and, instead of the site, you see a message like “The policy of this organization doesn’t allow you to browse that website”. That means your IT department has put a URL filter in place. A URL filter categorizes websites on the internet and allows or blocks access to them for the users of an organization. It does so either by referring to an already categorized central database (maintained by URL filtering vendors) or by classifying websites in real time. If required, URL filtering can also be applied only during certain times of the day or certain days of the week.
Why is URL Filtering required?
URL filtering is required to stop the users of an organization from accessing, during working hours, websites that:
¤ Drain their productivity
¤ Let them view objectionable content from the workplace
¤ Are bandwidth intensive and hence strain network resources
How is URL Filtering done?
URL filtering vendors maintain a highly categorized database of most of the websites on the internet. Access to each category is then allowed or disallowed for the users of an organization, either at all times or during certain times of the day. The IT department sets the policy of which categories are allowed or disallowed through a web-based interface provided by the URL filter. In practice, a local hardware appliance or a software application running on a server connects to the vendor’s central database, which enables it to block individual websites.
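The category-plus-schedule policy described above can be sketched roughly as follows. This is only an illustration: the domain names, the `CATEGORY_DB` and `BLOCK_WINDOWS` tables, and the `is_blocked` function are all hypothetical, and a real product would query the vendor's database rather than a dictionary.

```python
from datetime import datetime, time

# Hypothetical category database: domain -> category.
# In a real deployment this data comes from the URL filtering vendor.
CATEGORY_DB = {
    "example-social.test": "social networking",
    "example-news.test": "news",
}

# Hypothetical policy set by the IT department:
# category -> list of (start, end) time windows during which it is blocked.
BLOCK_WINDOWS = {
    "social networking": [(time(9, 0), time(17, 0))],
}


def is_blocked(domain: str, when: datetime) -> bool:
    """Return True if the domain's category is blocked at the given moment."""
    category = CATEGORY_DB.get(domain)
    if category is None:
        return False  # assumption: uncategorized sites are allowed in this sketch
    for start, end in BLOCK_WINDOWS.get(category, []):
        if start <= when.time() < end:
            return True
    return False
```

With this policy, `is_blocked("example-social.test", datetime(2024, 1, 8, 10, 0))` is true during office hours, while the same request at 20:00 is allowed.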
There might also be a local database, which is updated fully or partially from the central database. Updating it completely, however, can cause problems of its own, such as high bandwidth or memory usage. Some vendors instead update the local database as and when users visit websites (this typically takes only a few milliseconds).
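The idea of a local database that is filled in as users visit sites can be sketched as a simple on-demand cache. The `CentralDatabase` and `LocalCache` classes below are hypothetical stand-ins, not any vendor's actual API; the central lookup would normally be a network call.

```python
class CentralDatabase:
    """Hypothetical stand-in for the vendor's central database."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.lookups = 0  # counts how often the central database is queried

    def lookup(self, domain):
        self.lookups += 1
        return self._entries.get(domain, "uncategorized")


class LocalCache:
    """Local database that is populated only for domains users actually visit."""

    def __init__(self, central):
        self._central = central
        self._cache = {}

    def category(self, domain):
        # Query the central database only on the first visit to a domain.
        if domain not in self._cache:
            self._cache[domain] = self._central.lookup(domain)
        return self._cache[domain]
```

Only the first request for a domain reaches the central database; later requests are answered locally, which avoids downloading the full database up front.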
A website can be placed in a single category or in multiple categories, and blocking can be applied accordingly. For example, websites can be allowed if they are categorized as sports, but not if they are categorized as both sports and gambling.
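The sports versus sports-and-gambling example can be expressed as a small rule over sets of categories. The category names and the `decide` function are illustrative assumptions, and the default of blocking anything not explicitly allowed is a choice made for this sketch, not something the article prescribes.

```python
# Hypothetical policy: any blocked category wins over allowed ones.
BLOCKED_CATEGORIES = {"gambling"}
ALLOWED_CATEGORIES = {"sports", "news"}


def decide(categories: set) -> str:
    """Return "allow" or "block" for a site with the given set of categories."""
    if categories & BLOCKED_CATEGORIES:
        return "block"  # at least one category is explicitly blocked
    if categories and categories <= ALLOWED_CATEGORIES:
        return "allow"  # every category is explicitly allowed
    return "block"  # assumption: default-deny for anything else
```

Here `decide({"sports"})` allows the site, while `decide({"sports", "gambling"})` blocks it.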
Generally, URL filtering companies rate websites by their domain names (in addition to their URLs), because one domain can have many URLs and the number of URLs tends to grow frequently. Optionally, the IP addresses behind the domain names can also be included when rating the domains. Sub-domains need to be classified in addition to the main domains (for blogs, etc.). Intermediate pages need to be classified in addition to, or based on, the primary pages (as with translation sites or sites that display images from other websites). Websites in multiple languages may also need to be categorized in the same way.
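One way to picture domain-level rating with separately classified sub-domains is a lookup that tries the most specific host name first and then falls back to its parent domains. The `RATINGS` table and the `rate` function are hypothetical; how vendors actually structure their lookups is not described in this article.

```python
from urllib.parse import urlsplit

# Hypothetical ratings: a blog platform and one sub-domain rated differently.
RATINGS = {
    "example-blogs.test": "blogs",
    "poker.example-blogs.test": "gambling",
}


def rate(url):
    """Return the rating of the most specific matching (sub-)domain, or None."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Try the full host name first, then strip one leading label at a time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in RATINGS:
            return RATINGS[candidate]
    return None
```

A URL on `poker.example-blogs.test` gets the sub-domain's own rating, while any other sub-domain of `example-blogs.test` inherits the rating of the main domain.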
Categorizing websites in Real-time:
Since the internet is so huge, it is practically impossible to categorize in advance every website on it. So, when a user accesses a site that has not yet been categorized, the URL filtering system categorizes it ‘on the fly’, or in real time. This typically takes only a couple of hundred milliseconds, and the local databases are automatically updated along with the central database.
This categorization is done automatically by learning machines (automated software applications such as website crawlers), which retrieve key pieces or keywords (or sometimes all the words) of a website’s content and context to decide on the most appropriate category. Even the links from the website to other sites are analyzed to place it in the relevant category. Human professionals train these learning machines over a considerable period of time by feeding them training data (websites already categorized by human professionals) and adjusting their settings until they reproduce the same results.
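As a toy illustration of training on human-categorized pages and then categorizing new text by its words, the sketch below uses plain word counting. This is a deliberately simplified stand-in, not the method any vendor is known to use; the function names and the training sentences are made up for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_pages):
    """Count how often each word appears in pages of each category."""
    counts = defaultdict(Counter)
    for text, category in labelled_pages:
        counts[category].update(tokenize(text))
    return counts


def classify(counts, text):
    """Pick the category whose training words overlap most with the text."""
    words = tokenize(text)
    scores = {cat: sum(c[w] for w in words) for cat, c in counts.items()}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0:
        return None  # no evidence: leave the page for human review
    return best


# Training data categorized by humans (hypothetical examples).
model = train([
    ("football match score league", "sports"),
    ("poker casino bet jackpot", "gambling"),
])
```

A page the model cannot score returns `None`, which corresponds to the case where a human professional has to categorize the site.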
Human Intervention:
There are times when the learning machines are not able to classify a website. All such websites are categorized by human professionals, who also actively participate in training the machines and in analysing their results and abnormalities. Site submissions are also accepted from users and are reviewed by professionals for classification (for websites that are not already classified).
Advantages and Disadvantages of URL Filtering:
As mentioned earlier, URL filtering helps organizations improve productivity by making sure that employee time is not spent on unnecessary activities during office hours. URL filtering can also help by blocking malicious code, spyware, phishing sites, etc. that are potentially harmful to the organization. Some vendors also help block peer-to-peer software and instant messaging, which use more resources, waste time and are also a security threat.
Over-blocking can cause issues for users (for example, some commercial spyware needs to be installed for certain applications to work, and blocking it might deny users access to those applications). Over-blocking can also result in more help-desk tickets that need to be attended to and resolved by the support team. If that happens frequently, both the users and the support team lose an excessive amount of time. Another problem is that websites that have already been classified sometimes become threat sites or sites to be avoided at a later stage.
excITIngIP.com