When working with web scraping or content analysis, it’s essential to fetch the HTML content of a webpage. In this blog post, we’ll explore a simple C# program that demonstrates how to retrieve HTML from a specified URL and convert a hostname to its associated IP addresses.
1. Fetching HTML Content
The first part of our program focuses on fetching HTML content from a given URL. We use the HttpWebRequest and HttpWebResponse classes to send an HTTP request and receive the server’s response. The HTML content is then read from the response stream and returned to the caller.
// Namespaces used by the snippets in this post
using System;
using System.IO;
using System.Net;
using System.Text;
static string GetHtmlFromUrl(string url)
{
    // Reject a missing or empty URL up front
    if (string.IsNullOrEmpty(url))
        throw new ArgumentNullException(nameof(url), "Parameter is null or empty");
    // Holds either the page content or an error message
    string html = "";
    try
    {
        // GenerateHttpWebRequest is a helper that is not shown in this post;
        // it is expected to create and configure the HttpWebRequest for the URL
        HttpWebRequest request = GenerateHttpWebRequest(url);
        // Send the request and get the response from the server
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        // Get the response stream
        using (Stream responseStream = response.GetResponseStream())
        // Decode the body as UTF-8 (assumes the page is UTF-8 encoded)
        using (StreamReader reader = new StreamReader(responseStream, Encoding.UTF8))
        {
            html = reader.ReadToEnd();
        }
    }
    catch (Exception ex)
    {
        // On failure, return the error message in place of the HTML
        html = $"Error retrieving HTML content: {ex.Message}";
    }
    return html;
}
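The listing above calls GenerateHttpWebRequest, which the post does not show. The following is a minimal sketch of what such a helper might look like, not the author's actual implementation. The RequestFactory class name, the 10-second timeout, and the User-Agent string are all assumptions chosen for illustration. Note that creating the request does not contact the server; that only happens when GetResponse is called.

```csharp
using System;
using System.Net;

static class RequestFactory
{
    // Hypothetical sketch of the helper called by GetHtmlFromUrl.
    // The timeout and User-Agent values are illustrative assumptions.
    public static HttpWebRequest GenerateHttpWebRequest(string url)
    {
        // WebRequest.Create returns an HttpWebRequest for http:// and https:// URLs
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Timeout = 10000;                    // milliseconds (assumed value)
        request.UserAgent = "HtmlFetchSample/1.0";  // hypothetical User-Agent
        request.AllowAutoRedirect = true;           // follow HTTP redirects
        return request;
    }
}
```

To use it with the listing above, either move the method into the same class as GetHtmlFromUrl or call it as RequestFactory.GenerateHttpWebRequest(url). On .NET 6 and later, WebRequest.Create compiles with an obsolescence warning because HttpClient is the recommended replacement.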
2. Converting Hostname to IP Addresses
The second part of our program demonstrates how to resolve a hostname to its associated IP addresses using the Dns.GetHostEntry method. This method returns an IPHostEntry object whose AddressList property contains the IP addresses for the given hostname.
static string HostNameToIP(string hostname)
{
    try
    {
        // Resolve the hostname into an IPHostEntry using the Dns class
        IPHostEntry iphost = Dns.GetHostEntry(hostname);
        // Get all possible IP addresses for this hostname
        IPAddress[] addresses = iphost.AddressList;
        // Build a text representation of the IP addresses
        StringBuilder addressList = new StringBuilder();
        // Append each IP address, separated by semicolons
        foreach (IPAddress address in addresses)
        {
            addressList.AppendFormat("IP Address: {0};", address);
        }
        return addressList.ToString();
    }
    catch (Exception ex)
    {
        // On failure, return the error message in place of the address list
        return $"Error resolving hostname to IP: {ex.Message}";
    }
}
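A hostname can resolve to both IPv4 and IPv6 addresses, and the listing above prints them without distinguishing the two. As a small sketch of one way to label them, the hypothetical AddressFormatter.Describe helper below checks each address's AddressFamily. The class and method names are assumptions for illustration and are not part of the original program.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Sockets;

static class AddressFormatter
{
    // Hypothetical helper: labels each address as IPv4 or IPv6
    // and joins the results with "; ".
    public static string Describe(IPAddress[] addresses)
    {
        return string.Join("; ", addresses.Select(a =>
            a.AddressFamily == AddressFamily.InterNetworkV6
                ? "IPv6: " + a
                : "IPv4: " + a));
    }
}
```

It could be called with the array returned by the resolver, for example AddressFormatter.Describe(Dns.GetHostEntry(hostname).AddressList).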
3. Putting It All Together
In the Main method, we put these functions to use by fetching the HTML content from “http://www.google.com” and resolving its hostname to IP addresses.
static void Main(string[] args)
{
    string targetUrl = "http://www.google.com";
    // Get and print the HTML content from the specified URL
    string htmlContent = GetHtmlFromUrl(targetUrl);
    Console.WriteLine(htmlContent);
    // Dns.GetHostEntry expects a hostname, not a full URL, so extract the host first
    string hostname = new Uri(targetUrl).Host;
    // Resolve the hostname to IP addresses and print them
    string ipAddresses = HostNameToIP(hostname);
    Console.WriteLine(ipAddresses);
    // Wait for input so the console window stays open
    Console.Read();
}
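Dns.GetHostEntry takes a host name or IP address, not a full URL, so the host part has to be separated from a URL such as “http://www.google.com” before it is resolved. The sketch below shows one defensive way to do that with Uri.TryCreate. The UrlHelper class and ExtractHost method are hypothetical names introduced here for illustration.

```csharp
using System;

static class UrlHelper
{
    // Hypothetical helper: returns the host part of an absolute http/https URL,
    // or the input unchanged when it is not such a URL (e.g. a bare hostname).
    public static string ExtractHost(string urlOrHost)
    {
        if (Uri.TryCreate(urlOrHost, UriKind.Absolute, out Uri uri) &&
            (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
        {
            return uri.Host;
        }
        return urlOrHost;
    }
}
```

With this in place, HostNameToIP(UrlHelper.ExtractHost(targetUrl)) accepts either a URL or a plain hostname as input.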
This simple C# program provides a foundation for more advanced web scraping and content analysis tasks. Feel free to customize and expand upon it based on your specific requirements.