Sounds simple, right? Well, listing directory contents is a core task for most programming languages. But there are a few gotchas to bear in mind when creating a PHP directory list class. First of all you need to decide whether you need to recurse – that is, follow subdirectories. If so, you must be careful to avoid infinite recursion. If you want to use a child class to extend the base functionality then you’ll need to provide a mechanism for that class to control listing behaviour. In this post, I’ll put a simple example together.
Old school or new school
When dealing with directories you can choose to use core PHP functions such as opendir() and readdir() or you can use the newer SPL tool DirectoryIterator. This latter approach is the cleaner way to go for several reasons:
DirectoryIterator internalises the steps involved in acquiring a directory resource.
DirectoryIterator implements Iterator (by way of SeekableIterator) – which means you can loop over the contents of a directory like an array – with foreach.
DirectoryIterator is object-oriented and provides easy access to common operations and tests you might want to perform on files and directories.
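To make that last point concrete, here is a minimal sketch of the kind of tests and accessors DirectoryIterator puts within reach. The describeDirectory() function, the summary format and the throwaway directory are my own inventions for illustration; the methods called on each item (isDot(), isDir(), isFile(), getFilename(), getSize()) belong to the SPL class.

```php
<?php
// Hypothetical helper (not from the original post): summarise each entry
// in a directory using DirectoryIterator's built-in tests and accessors.
function describeDirectory(string $dir): array
{
    $summary = [];
    foreach (new \DirectoryIterator($dir) as $item) {
        if ($item->isDot()) {
            continue; // skip "." and ".."
        }
        if ($item->isDir()) {
            $summary[$item->getFilename()] = "directory";
        } elseif ($item->isFile()) {
            $summary[$item->getFilename()] = "file, {$item->getSize()} bytes";
        }
    }
    ksort($summary); // directory order is not guaranteed, so sort by name
    return $summary;
}

// Usage: build a throwaway directory so the sketch can run anywhere.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("lister_demo_");
mkdir($base);
mkdir($base . DIRECTORY_SEPARATOR . "docs");
file_put_contents($base . DIRECTORY_SEPARATOR . "notes.txt", "hello");

$summary = describeDirectory($base);
foreach ($summary as $name => $description) {
    print "$name: $description\n";
}
```

Doing the same with the older functions would mean building each path by hand and calling is_dir(), is_file() and filesize() on it.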
Listing the contents of a directory with DirectoryIterator
So let’s get started. Generating a PHP directory list is as simple as instantiating a DirectoryIterator object and looping through it:
$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    print "$item\n";
}
Why the backslash in front of DirectoryIterator? Well, best practice requires that we put most of our code under a namespace. We use the backslash to ensure that the SPL DirectoryIterator class in the global namespace is instantiated and not a local class of the same name. Note also that I directly printed the $item variable even though it contained an instance of DirectoryIterator. I was able to do this because the DirectoryIterator object has a __toString() method which will return a string when invoked in a string context.
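As a quick check of what that string actually is, the sketch below compares the result of casting an item to a string with getFilename(). The temporary directory and the example.txt file are made up for the demo; the point is only that __toString() yields the bare file name, not the full path.

```php
<?php
// Sketch only: the directory and file names here are invented for the demo.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("tostring_demo_");
mkdir($dir);
touch($dir . DIRECTORY_SEPARATOR . "example.txt");

$rows = [];
foreach (new \DirectoryIterator($dir) as $item) {
    if ($item->isDot()) {
        continue; // ignore "." and ".."
    }
    // Casting to string invokes __toString(), which gives the file name only.
    $rows[] = [
        "cast"     => (string) $item,
        "filename" => $item->getFilename(),
        "pathname" => $item->getPathname(), // full path, for comparison
    ];
}

foreach ($rows as $row) {
    print "{$row['cast']} | {$row['filename']} | {$row['pathname']}\n";
}
```

If you need the full path in your output, call getPathname() rather than relying on the string conversion.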
Compare the previous fragment with the old way of achieving the same thing:
$dir = "/home/mattz";
$dh = opendir($dir);
while (($item = readdir($dh)) !== false) {
    print "$item\n";
}
closedir($dh);
Much less clean and easy.
Testing for dot directories
This is essential if we are to go recursive because “.” represents the current directory and “..” represents the parent directory. These appear in all Unix directory listings. If my PHP directory list code were to follow these directories it would never stop recursing – and the script would soon blow up. DirectoryIterator provides a simple method to check this.
$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    // $item is the current entry; ask it whether it is "." or ".."
    if ($item->isDot()) {
        print "ignoring the dot!\n";
        continue;
    }
    print "$item\n";
}
isDot() tests that the item is both a directory and one of “.” and “..”. Using the older functions, I must perform those tests for myself:
$dir = "/home/mattz";
$dh = opendir($dir);
$s = DIRECTORY_SEPARATOR;
while (($item = readdir($dh)) !== false) {
    if (is_dir("{$dir}{$s}{$item}") && ($item == "." || $item == "..")) {
        print "ignoring the dot!\n";
        continue;
    }
    print "$item\n";
}
closedir($dh);
Testing for symlinks
This is another issue that can cause problems when exploring a directory structure. If we follow symlinks (alias directories) we can find ourselves moving unexpectedly into new parts of the filesystem – leading to some unexpected or even dangerous results. So by default I’m going to turn off symlink following. As you might expect, DirectoryIterator has a handy method to achieve this:
$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    if (
        $item->isDot() ||
        ($item->isDir() && $item->isLink())
    ) {
        print "($item) ignoring the dot or the symlink dir!\n";
        continue;
    }
    print "$item\n";
}
We only want to ignore symlinks to directories so we employ two tests here: isDir() and isLink().
Going old school, I can use the is_link() function.
$dir = "/home/mattz";
$dh = opendir($dir);
$s = DIRECTORY_SEPARATOR;
while (($item = readdir($dh)) !== false) {
    $path = "{$dir}{$s}{$item}";
    if (is_dir($path) &&
        (
            ($item == "." || $item == "..") ||
            is_link($path)
        )
    ) {
        print "ignoring the dot or the symlink dir!\n";
        continue;
    }
    print "$item\n";
}
closedir($dh);
Putting it together – a PHP directory list class
I’m going to place my PHP directory list functionality into a class for two reasons. Firstly, a class will allow me to save state (if I want to compile a filtered list, for example). Secondly, a class will be easy to extend for future uses.
Here goes:
class Lister
{
    public function listdir(\DirectoryIterator $iterator)
    {
        foreach ($iterator as $file) {
            // skip ".", ".." and symlinked directories
            if ($file->isDot() || ($file->isDir() && $file->isLink())) {
                continue;
            }
            if ($file->isDir()) {
                // only descend if handleDir() approves
                if ($this->handleDir($file)) {
                    $this->listdir(new \DirectoryIterator($file->getPathName()));
                }
                continue;
            }
            if (! $this->handleFile($file)) {
                // no further iteration of this directory
                return;
            }
        }
    }

    protected function handleDir(\DirectoryIterator $it)
    {
        print "$it\n";
        return true;
    }

    protected function handleFile(\DirectoryIterator $it)
    {
        print "$it\n";
        return true;
    }
}
So the only new DirectoryIterator piece here is getPathName(). That returns the full path of the current file or directory. I use that to create a new DirectoryIterator object which I pass to the listdir() method all over again. In this way, when I encounter a directory that is not a symlink or one of . or .. I jump down and start the process all over again.
There is an exception to this. I only make this recursive call if a method named handleDir(), which I call first, returns true. Similarly, I call handleFile() for each file I encounter in the directory listing. If handleFile() does not return true, I abort the listing in the current directory. Since these methods are hardcoded to return true, this might seem redundant. I’ll show you how it can be made useful in the next section.
First, though, here’s the old school version of this code:
class Lister
{
    public function listdir($dir)
    {
        $dh = opendir($dir);
        $s = DIRECTORY_SEPARATOR;
        while (($item = readdir($dh)) !== false) {
            $path = "{$dir}{$s}{$item}";
            if (is_dir($path) &&
                (
                    ($item == "." || $item == "..") ||
                    is_link($path)
                )
            ) {
                continue;
            }
            if (is_dir($path)) {
                if ($this->handleDir($path)) {
                    $this->listdir($path);
                }
                // a directory must not fall through to handleFile()
                continue;
            }
            if (! $this->handleFile($path)) {
                // no further iteration of this directory
                closedir($dh);
                return;
            }
        }
        closedir($dh);
    }

    protected function handleDir($dir)
    {
        print "{$dir}\n";
        return true;
    }

    protected function handleFile($file)
    {
        print "{$file}\n";
        return true;
    }
}
As you can see, things are beginning to get clunky – and it will only get worse as you begin to work more with the files and paths. The SPL classes exist to encapsulate complexity – which usually means cleaner, more elegant code.
Using the lister: the A game
In common with most object-oriented coders, I hate duplication. Duplicated code is inelegant and it can cause problems over time. With duplications in your system you have to remember to fix bugs and add features in every place the code block is repeated. Before you know it, parts of your system fall out of alignment with others, and things spin out of control like a blaster-clipped TIE fighter. By creating a single parent class with common functionality, you define core functionality only once.
My PHP directory list class – Lister – is designed to be overridden in this way. By creating a child class and then overriding handleDir() and handleFile() you can do what you like with the directories and files the parent class traverses for you. Here, for example, is AllTheAs, a simple PHP directory list class that’s designed only to navigate and print items that begin with the letter ‘a’.
class AllTheAs extends Lister
{
    protected function handleDir(\DirectoryIterator $it)
    {
        if (strpos($it->getFileName(), "a") === 0) {
            print "$it\n";
            return true;
        }
        return false;
    }

    protected function handleFile(\DirectoryIterator $it)
    {
        if (strpos($it->getFileName(), "a") === 0) {
            print "$it\n";
        }
        return true;
    }
}
Because handleDir() only returns true when the provided directory begins with the letter ‘a’, only such directories will be traversed. handleFile() always returns true because I don’t want to terminate listing within a directory – but it only outputs matching file names.
Here’s the code I use to call it:
$lister = new AllTheAs();
$lister->listdir(new \DirectoryIterator("/home/mattz/Dropbox"));
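One of the reasons given earlier for using a class was the ability to save state, for example to compile a filtered list. A rough sketch of that idea follows. ExtensionCollector, its getFound() method and the temporary directory layout are all hypothetical names of my own, and the Lister shown is a condensed copy of the class above (with the print statements dropped) so that the fragment runs by itself.

```php
<?php
// Condensed copy of the Lister class so this sketch is self-contained.
class Lister
{
    public function listdir(\DirectoryIterator $iterator)
    {
        foreach ($iterator as $file) {
            if ($file->isDot() || ($file->isDir() && $file->isLink())) {
                continue;
            }
            if ($file->isDir()) {
                if ($this->handleDir($file)) {
                    $this->listdir(new \DirectoryIterator($file->getPathname()));
                }
                continue;
            }
            if (! $this->handleFile($file)) {
                return;
            }
        }
    }
    protected function handleDir(\DirectoryIterator $it) { return true; }
    protected function handleFile(\DirectoryIterator $it) { return true; }
}

// Hypothetical child class: remembers matching paths instead of printing them.
class ExtensionCollector extends Lister
{
    private $extension;
    private $found = [];

    public function __construct(string $extension)
    {
        $this->extension = $extension;
    }

    protected function handleFile(\DirectoryIterator $it)
    {
        if ($it->getExtension() === $this->extension) {
            $this->found[] = $it->getPathname(); // save state for later
        }
        return true; // never cut a directory listing short
    }

    public function getFound(): array
    {
        sort($this->found); // directory order is not guaranteed
        return $this->found;
    }
}

// Usage: a throwaway tree with two .txt files and one .log file.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("collector_demo_");
mkdir($base . DIRECTORY_SEPARATOR . "sub", 0777, true);
touch($base . DIRECTORY_SEPARATOR . "a.txt");
touch($base . DIRECTORY_SEPARATOR . "c.log");
touch($base . DIRECTORY_SEPARATOR . "sub" . DIRECTORY_SEPARATOR . "b.txt");

$collector = new ExtensionCollector("txt");
$collector->listdir(new \DirectoryIterator($base));
$found = array_map("basename", $collector->getFound());
print implode("\n", $found) . "\n";
```

The traversal logic stays in the parent, as before; the child only decides what to remember.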