Archive for January, 2012

Get more elaborate XML validation errors with PHP

Posted in PHP, XML on January 23rd, 2012 by Joonas – 1 Comment

In a previous post, “Validate XML file against an XSD schema“, I showed how you validate an XML document against a corresponding XSD document. By default, the DOMDocument class does not spit out very elaborate debug messages when a file fails validation.

You can use PHP’s libxml functions to enable advanced errors that will tell you exactly what the problem is and what line it is on, etc. Check out the code below for a simple use example.

# Enable advanced libxml errors
libxml_use_internal_errors(true);
 
$filename = 'your-input-file.xml';
$xsd_document = 'your-schema-document.xsd';
$dom = new DomDocument();
 
if (!$dom->load($filename)) die('Could not load XML file: '.$filename);
if (!$dom->schemaValidate($xsd_document))
{
    print_r(libxml_get_errors());
    die('XML file did not validate against schema: '.$xsd_document);
}

The advanced errors will help you spot errors in XML faster and be able to provide better reporting on what is going on with the failures of what ever your implementation may be.

There are some other cool things that you can do with the libxml functions as well.

Convert bad characters to UTF-8 in an XML file with PHP

Posted in PHP, XML on January 12th, 2012 by Joonas – Be the first to comment

I ran in to a situation where a vendor was sending an XML file to my system and I needed to parse it and do a bunch of stuff with the data. Well, I wanted to load the file with PHP DOMDocument, but as soon as I did, the following error started to cause grief.

Warning: DOMDocument::load(): Input is not proper UTF-8, indicate encoding !

The error happens because there are non-UTF-8 characters in the XML file. The vendor was writing a bad characters to the file. In an ideal world, the problem should have been fixed upstream by the vendor, but that was out of the question. I came up with the following solution.

The solution is to search the file for non-UTF-8 characters and convert them to UTF-8 with PHP’s utf8_encode function. See the sample code below.

<?php
 
# The XML file with bad characters
$filename = "sample_xml_file.xml";
 
# Read file contents to a variable
$contents = file_get_contents($filename);
 
# Find the bad characters
preg_match_all('/[^(\x20-\x7F)]*/', $contents, $badchars);
 
# Process bad characters if some were found
if(isset($badchars[0]))
{
        # Narrow down the results to uniques only
        $badchars[0] = array_unique($badchars[0]);
 
        # Replace the bad characters with their UTF8 equivalents
        foreach($badchars[0] as $badchar)
        {
                $contents = preg_replace("/".$badchar."/", utf8_encode($badchar), $contents);
        }
}
 
# Write the fixed contents back to the file
file_put_contents($filename, $contents);
 
# Cleanup
unset($contents);
 
# Now the bad characters have been encoded to UTF8
# It will now load file with DOMDocument
$dom = new DOMDocument();
$dom->load($filename);
 
?>

This solution worked for all the bad iso-8859-1 characters that I have run in to, which were mostly control characters, but if it doesn’t work for you and/or you have a different approach, please leave a comment!

Export data from a database to CSV / Excel with PHP

Posted in PHP on January 9th, 2012 by Joonas – 2 Comments

Let’s assume that you already have a database setup with a table called “users” and you want to export users and their email addresses to a CSV file. PHP has built in functions to deal with CSV files.

“CSV is a simple file format that is widely supported by consumer, business, and scientific applications. Among its most common uses is to move tabular data between programs that naturally operate on a more efficient or complete proprietary format. For example: a CSV file might be used to transfer information from a database program to a spreadsheet.” – Wikipedia

The code to connect to a MySQL database, query for users, create a file, and write CSV data to the file is provided below.

<?php
 
// Connect and query the database for the users
$conn = new PDO("mysql:host=localhost;dbname=mydatabase", 'myuser', 'mypassword');
$sql = "SELECT username, email FROM users ORDER BY username";
$results = $conn->query($sql);
 
// Pick a filename and destination directory for the file
// Remember that the folder where you want to write the file has to be writable
$filename = "/tmp/db_user_export_".time().".csv";
 
// Actually create the file
// The w+ parameter will wipe out and overwrite any existing file with the same name
$handle = fopen($filename, 'w+');
 
// Write the spreadsheet column titles / labels
fputcsv($handle, array('Username','Email'));
 
// Write all the user records to the spreadsheet
foreach($results as $row)
{
    fputcsv($handle, array($row['username'], $row['email']));
}
 
// Finish writing the file
fclose($handle);
 
?>

The purpose of this post is not to demonstrate how to connect to a database and/or query it for contents, nor how to loop through query results, so I will not explain how PHP’s PDO database abstraction class works. The above code demonstrates how you create a file and write CSV data to it, so let’s walk through the above code.

First, the script connects to a MySQL database using PHP’s PDO class and queries the users table for all the users sorted by username. It then generates a filename with the current UNIX timestamp appended at the end. In this example, the file will be written to the server’s /tmp/ directory. You can change the path to what ever directory you want as long as you have write access to it.

The code then opens the file connection, which creates the file and overwrites any existing file. Once the file is created, the column labels are written to the file. In this example, we are only writing two columns, a Username and an Email column. Next comes the main loop that iterates through all the users that our SQL query got from the database. The loop writes the username and the email of each user record. Finally, we close the file.

What you do with the CSV file beyond this point is up to you and your requirements.

I will write another post later on how to create an actual Excel file with an XLS/XLSX extension.

How to disable PHP open_basedir in PLESK 10

Posted in PHP, PLESK on January 5th, 2012 by Joonas – 2 Comments

PLESK 10 comes with open_basedir enabled by default. I ran in to an instance where I was dynamically creating new vhosts from time to time and could not be bothered to disable open_basedir on a case by case basis by creating a custom vhost.conf for every domain that I created. I needed open_basedir to be turned off server wide by default. The solution was to create a new config file that apache includes by default.

I did this on MediaTemple hosting using CentOS 5 and PLESK 10, but the solution should be the same or very similar on other setups as well.

First, login to your server as root and run the following command.

vim /etc/httpd/conf.d/remove_open_basedir.conf

Second, paste the following code in to the new file.

<DirectoryMatch /var/www/vhosts/(.*)/httpdocs/>
    php_admin_value open_basedir none
</DirectoryMatch>

Third, reconfigure all domains and restart apache.

/usr/local/psa/admin/sbin/httpdmng --reconfigure-all
/etc/init.d/httpd restart

Done! Now you should notice that open_basedir is disabled for all domains.

Please leave comments if this does not work for you or if the implementation on another version of Linux is different.

Do you want to disable open_basedir only for specific vhosts / domains?

Disabling open_basedir or setting it to something specific can be done painlessly on a per domain basis in PLESK. See the steps below.

First, create (or edit as necessary) a vhost specific config file.

vim /var/www/vhosts/example.com/conf/vhost.conf

Second, disable open_basedir by pasting in the code below.

<directory /var/www/vhosts/example.com/httpdocs>
php_admin_value open_basedir none
</directory>

Third, reconfigure your vhost files and restart apache.

/usr/local/psa/admin/sbin/httpdmng --reconfigure-all
/etc/init.d/httpd restart

Enjoy.