[0/2] net: phy: relax error checking when creating sysfs link netdev->phydev

Message ID 20180314222624.12744-1-grygorii.strashko@ti.com
Series net: phy: relax error checking when creating sysfs link netdev->phydev

Message

Grygorii Strashko March 14, 2018, 10:26 p.m. UTC
Some Ethernet drivers (like TI CPSW) may connect and manage more than one
PHY per netdevice. Such drivers produce a warning during system boot and
fail to connect the second PHY to the netdevice when the PHYLIB framework
tries to create the netdev->phydev sysfs link for the second PHY in
phy_attach_direct(), because a sysfs link with the same name has already
been created for the first PHY.
As a result, the second CPSW external port becomes unusable.
This issue was introduced by commits:
5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev")
a3995460491d ("net: phy: Relax error checking on sysfs_create_link()")

Patch 1: exports the sysfs_create_link_nowarn() function in preparation for
Patch 2.
Patch 2: relaxes error checking when the PHYLIB framework creates the
netdev->phydev sysfs link in phy_attach_direct(), suppresses the warning by
using sysfs_create_link_nowarn(), and adds a debug message instead.
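
The intended behaviour of Patch 2 can be modelled in plain userspace C.
This is only a rough sketch, not the patch itself: struct model_dir,
model_create_link_nowarn() and model_phy_attach() are hypothetical
stand-ins for the netdevice's sysfs directory, sysfs_create_link_nowarn()
and the relevant part of phy_attach_direct(). The only assumption is that
a duplicate link name yields -EEXIST without a warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MODEL_MAX_LINKS 4

/* Hypothetical stand-in for the netdevice's sysfs directory. */
struct model_dir {
	const char *names[MODEL_MAX_LINKS];
	int count;
};

/* Stand-in for sysfs_create_link_nowarn(): a duplicate name returns
 * -EEXIST quietly instead of triggering a warning.
 */
int model_create_link_nowarn(struct model_dir *dir, const char *name)
{
	int i;

	assert(dir && name);
	for (i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;
	if (dir->count == MODEL_MAX_LINKS)
		return -ENOSPC;
	dir->names[dir->count++] = name;
	return 0;
}

/* Model of the relaxed attach path: a failed netdev->phydev link is
 * logged and recorded, but the attach itself still succeeds.
 */
int model_phy_attach(struct model_dir *netdev_dir, bool *link_created)
{
	int err = model_create_link_nowarn(netdev_dir, "phydev");

	if (err)
		fprintf(stderr, "could not add phydev link, err %d (non-fatal)\n", err);
	*link_created = (err == 0);
	return 0;
}
```

Attaching two PHYs to the same model_dir succeeds both times in this
model; only the first attach ends up owning the "phydev" link.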

This is stable material for 4.13+.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Grygorii Strashko (2):
  sysfs: symlink: export sysfs_create_link_nowarn()
  net: phy: relax error checking when creating sysfs link netdev->phydev

 drivers/net/phy/phy_device.c | 15 +++++++++++----
 fs/sysfs/symlink.c           |  1 +
 2 files changed, 12 insertions(+), 4 deletions(-)

Comments

Andrew Lunn March 16, 2018, 5:22 p.m. UTC | #1
On Wed, Mar 14, 2018 at 05:26:22PM -0500, Grygorii Strashko wrote:
> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
> one netdevice, as result such drivers will produce warning during system
> boot and fail to connect second phy to netdevice when PHYLIB framework
> will try to create sysfs link netdev->phydev for second PHY
> in phy_attach_direct(), because sysfs link with the same name has been
> created already for the first PHY.
> As result, second CPSW external port will became unusable.
> This issue was introduced by commits:
> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"

I wonder if it would be better to add a flag to the phydev that
indicates it is the second PHY connected to a MAC? Add a bit to
phydrv->mdiodrv.flags. If that bit is set, don't create the sysfs
file.

For 99% of MAC drivers, having two PHYs is an error, so we want to aid
debug by reporting the sysfs error.
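
The flag idea can be illustrated with a small userspace model. This is
only a sketch under stated assumptions: DEMO_PHY_FLAG_SECONDARY,
struct demo_netdev and demo_attach() are hypothetical names, and no such
flag exists in PHYLIB today. A PHY marked as secondary skips the link,
while an unmarked second PHY still fails loudly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bit, invented for this sketch only. */
#define DEMO_PHY_FLAG_SECONDARY (1u << 0)

/* Models the single "phydev" entry in a netdevice's sysfs directory. */
struct demo_netdev {
	bool has_phydev_link;
};

int demo_attach(struct demo_netdev *nd, unsigned int phy_flags)
{
	assert(nd);

	/* The MAC driver declared this PHY as an additional one:
	 * do not try to create the link at all.
	 */
	if (phy_flags & DEMO_PHY_FLAG_SECONDARY)
		return 0;

	/* Unflagged PHY on a netdevice that already has a link:
	 * keep reporting the error, since this is usually a bug.
	 */
	if (nd->has_phydev_link) {
		fprintf(stderr, "phydev link already exists\n");
		return -EEXIST;
	}

	nd->has_phydev_link = true;
	return 0;
}
```

In this model a driver that really manages two PHYs passes the flag for
the second one, and every other driver keeps the existing error report.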

      Andrew
Florian Fainelli March 16, 2018, 5:34 p.m. UTC | #2
On 03/16/2018 10:22 AM, Andrew Lunn wrote:
> On Wed, Mar 14, 2018 at 05:26:22PM -0500, Grygorii Strashko wrote:
>> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
>> one netdevice, as result such drivers will produce warning during system
>> boot and fail to connect second phy to netdevice when PHYLIB framework
>> will try to create sysfs link netdev->phydev for second PHY
>> in phy_attach_direct(), because sysfs link with the same name has been
>> created already for the first PHY.
>> As result, second CPSW external port will became unusable.
>> This issue was introduced by commits:
>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
>> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
> 
> I wonder if it would be better to add a flag to the phydev that
> indicates it is the second PHY connected to a MAC? Add a bit to
> phydrv->mdiodrv.flags. If that bit is set, don't create the sysfs
> file.

We could indeed do that. I am fine with Grygorii's approach, though, of
making the creation quieter and non-fatal.

> 
> For 99% of MAC drivers, having two PHYs is an error, so we want to aid
> debug by reporting the sysfs error.
That is true, either way is fine with me, really.
Grygorii Strashko March 16, 2018, 6:42 p.m. UTC | #3
On 03/16/2018 12:34 PM, Florian Fainelli wrote:
> 
> 
> On 03/16/2018 10:22 AM, Andrew Lunn wrote:
>> On Wed, Mar 14, 2018 at 05:26:22PM -0500, Grygorii Strashko wrote:
>>> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
>>> one netdevice, as result such drivers will produce warning during system
>>> boot and fail to connect second phy to netdevice when PHYLIB framework
>>> will try to create sysfs link netdev->phydev for second PHY
>>> in phy_attach_direct(), because sysfs link with the same name has been
>>> created already for the first PHY.
>>> As result, second CPSW external port will became unusable.
>>> This issue was introduced by commits:
>>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
>>> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
>>
>> I wonder if it would be better to add a flag to the phydev that
>> indicates it is the second PHY connected to a MAC? Add a bit to
>> phydrv->mdiodrv.flags. If that bit is set, don't create the sysfs
>> file.
> 
> We could indeed do that, I am fine with Grygorii's approach though in
> making the creation more silent and non fatal.

The phydev->netdev link can still be created, and failure to create links
is a non-fatal error in my opinion.

> 
>>
>> For 99% of MAC drivers, having two PHYs is an error, so we want to aid
>> debug by reporting the sysfs error.
> That is true, either way is fine with me, really.
> 

The error will still be reported, just not as a warning, and it will be non-fatal.
So, with this patch set it will now be possible to continue booting (over NFS, for example),
connect to the system, and gather logs.
Florian Fainelli March 16, 2018, 7:11 p.m. UTC | #4
On March 16, 2018 11:42:21 AM PDT, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>
>
>On 03/16/2018 12:34 PM, Florian Fainelli wrote:
>> 
>> 
>> On 03/16/2018 10:22 AM, Andrew Lunn wrote:
>>> On Wed, Mar 14, 2018 at 05:26:22PM -0500, Grygorii Strashko wrote:
>>>> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net
>PHYs per
>>>> one netdevice, as result such drivers will produce warning during
>system
>>>> boot and fail to connect second phy to netdevice when PHYLIB
>framework
>>>> will try to create sysfs link netdev->phydev for second PHY
>>>> in phy_attach_direct(), because sysfs link with the same name has
>been
>>>> created already for the first PHY.
>>>> As result, second CPSW external port will became unusable.
>>>> This issue was introduced by commits:
>>>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for
>attached_dev/phydev"
>>>> a3995460491d ("net: phy: Relax error checking on
>sysfs_create_link()"
>>>
>>> I wonder if it would be better to add a flag to the phydev that
>>> indicates it is the second PHY connected to a MAC? Add a bit to
>>> phydrv->mdiodrv.flags. If that bit is set, don't create the sysfs
>>> file.
>> 
>> We could indeed do that, I am fine with Grygorii's approach though in
>> making the creation more silent and non fatal.
>
>The link phydev->netdev still can be created. And failure to create
>links
>is non fatal error in my opinion. 

They should not be fatal, I agree, but it's nice to know when you are doing something wrong anyway.

>
>> 
>>>
>>> For 99% of MAC drivers, having two PHYs is an error, so we want to
>aid
>>> debug by reporting the sysfs error.
>> That is true, either way is fine with me, really.
>> 
>
>Error still will be reported, just not warning and it will be
>non-fatal.
>So, with this patch set it will be possible now to continue boot (NFS
>for example),
>connect to the system and gather logs.

The point Andrew is trying to make is that you address one particular
failure in the PHY creation path when using more than one PHY device with
a network device. Using a flag would easily allow us to be more
future-proof with other parts of PHYLIB for your particular use case, if
that becomes necessary. This gives you less incentive to fix this use case
though.
Grygorii Strashko March 16, 2018, 7:41 p.m. UTC | #5
On 03/16/2018 02:11 PM, Florian Fainelli wrote:
> On March 16, 2018 11:42:21 AM PDT, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>>
>>
>> On 03/16/2018 12:34 PM, Florian Fainelli wrote:
>>>
>>>
>>> On 03/16/2018 10:22 AM, Andrew Lunn wrote:
>>>> On Wed, Mar 14, 2018 at 05:26:22PM -0500, Grygorii Strashko wrote:
>>>>> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net
>> PHYs per
>>>>> one netdevice, as result such drivers will produce warning during
>> system
>>>>> boot and fail to connect second phy to netdevice when PHYLIB
>> framework
>>>>> will try to create sysfs link netdev->phydev for second PHY
>>>>> in phy_attach_direct(), because sysfs link with the same name has
>> been
>>>>> created already for the first PHY.
>>>>> As result, second CPSW external port will became unusable.
>>>>> This issue was introduced by commits:
>>>>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for
>> attached_dev/phydev"
>>>>> a3995460491d ("net: phy: Relax error checking on
>> sysfs_create_link()"
>>>>
>>>> I wonder if it would be better to add a flag to the phydev that
>>>> indicates it is the second PHY connected to a MAC? Add a bit to
>>>> phydrv->mdiodrv.flags. If that bit is set, don't create the sysfs
>>>> file.
>>>
>>> We could indeed do that, I am fine with Grygorii's approach though in
>>> making the creation more silent and non fatal.
>>
>> The link phydev->netdev still can be created. And failure to create
>> links
>> is non fatal error in my opinion.
> 
> They should not be fatal I agree, but it's nice to know when you are doing something wrong anyway.
> 
>>
>>>
>>>>
>>>> For 99% of MAC drivers, having two PHYs is an error, so we want to
>> aid
>>>> debug by reporting the sysfs error.
>>> That is true, either way is fine with me, really.
>>>
>>
>> Error still will be reported, just not warning and it will be
>> non-fatal.
>> So, with this patch set it will be possible now to continue boot (NFS
>> for example),
>> connect to the system and gather logs.
> 
> The point Andrew is trying to make is that you address one particular
> failure in the PHY creation path when using > 1 PHY devices with a
> network device. Using a flag would easily allow us to be more future
> proof with other parts of PHYLIB for your particular use case if that
> becomes necessary. This gives you less incentive to fix this use case
> though.
> 

That's true, I'm fixing the >1 PHY use case, and I'll try to re-implement it using a flag as requested.
But note that this patch in its current form also fixes the 1:1 (phydev:netdev) use case (at least as I understand it),
because the current code just kills the network connection if creating the sysfs link fails, so in the case of net boot
the failure logs are not accessible without direct access to the device.

Actually, how can I pass this flag "<name of the flag>" from CPSW to of_phy_connect()->phy_attach_direct()?
The "flags" parameter (== phy_device->dev_flags) is used to pass PHY-driver-specific options, so it can't be used.

phydrv->mdiodrv.flags is accessible only after the call to of_phy_connect()/phy_connect(),
but the sysfs links are created inside these functions.

Thanks.
Andrew Lunn March 16, 2018, 7:54 p.m. UTC | #6
> The phydrv->mdiodrv.flags can be accessible only after call to of_phy_connect()/phy_connect(), 

You need to use a function like of_phy_find_device() to get the
phydev, set the flag, and then call phy_connect_direct(). 

	Andrew
Grygorii Strashko March 16, 2018, 8:13 p.m. UTC | #7
On 03/16/2018 02:54 PM, Andrew Lunn wrote:
>> The phydrv->mdiodrv.flags can be accessible only after call to of_phy_connect()/phy_connect(),
> 
> You need to use a function like of_phy_find_device() to get the
> phydev, set the flag, and then call phy_connect_direct().


So, are you proposing that I replace the direct calls to of_phy_connect()/phy_connect() in
the CPSW driver with the bodies of those same functions? Right?

cpsw_slave_open()
{
....
	if (slave->data->phy_node) {
		phy = of_phy_connect(priv->ndev, slave->data->phy_node,
				 &cpsw_adjust_link, 0, slave->data->phy_if);
----- replace ^^^^ with below
{
	struct phy_device *phy = of_phy_find_device(phy_np);
	int ret;

	if (!phy)
		return NULL;

	phy->dev_flags = flags;

-----	[set flag in phydrv->mdiodrv.flags]

	ret = phy_connect_direct(dev, phy, hndlr, iface);

	/* refcount is held by phy_connect_direct() on success */
	put_device(&phy->mdio.dev);

	return ret ? NULL : phy;
}
-----
		if (!phy) {
			dev_err(priv->dev, "phy \"%pOF\" not found on slave %d\n",
				slave->data->phy_node,
				slave->slave_num);
			return;
		}
	} else {
		phy = phy_connect(priv->ndev, slave->data->phy_id,
				 &cpsw_adjust_link, slave->data->phy_if);
----- replace ^^^^ with below
{
	struct phy_device *phydev;
	struct device *d;
	int rc;

	/* Search the list of PHY devices on the mdio bus for the
	 * PHY with the requested name
	 */
	d = bus_find_device_by_name(&mdio_bus_type, NULL, bus_id);
	if (!d) {
		pr_err("PHY %s not found\n", bus_id);
		return ERR_PTR(-ENODEV);
	}
	phydev = to_phy_device(d);

-----	[set flag in phydrv->mdiodrv.flags]

	rc = phy_connect_direct(dev, phydev, handler, interface);
	put_device(d);
	if (rc)
		return ERR_PTR(rc);

	return phydev;
}
-----
		if (IS_ERR(phy)) {
			dev_err(priv->dev,
				"phy \"%s\" not found on slave %d, err %ld\n",
				slave->data->phy_id, slave->slave_num,
				PTR_ERR(phy));
			return;
		}
	}
}

and all of the above just to set a flag which will be used by just one driver as of now.

Hm. Is this some sort of punishment ;) Sorry, I'll probably take a pause.
Florian Fainelli March 16, 2018, 9:09 p.m. UTC | #8
On 03/16/2018 01:13 PM, Grygorii Strashko wrote:
> 
> 
> On 03/16/2018 02:54 PM, Andrew Lunn wrote:
>>> The phydrv->mdiodrv.flags can be accessible only after call to of_phy_connect()/phy_connect(),
>>
>> You need to use a function like of_phy_find_device() to get the
>> phydev, set the flag, and then call phy_connect_direct().
> 
> 
> So, do you propose me to replace direct calls of of_phy_connect()/phy_connect() in
> CPSW driver with buddies of the same functions? Right?
> 
> cpsw_slave_open()
> {
> ....
> 	if (slave->data->phy_node) {
> 		phy = of_phy_connect(priv->ndev, slave->data->phy_node,
> 				 &cpsw_adjust_link, 0, slave->data->phy_if);
> ----- replace ^^^^ with below
> {
> 	struct phy_device *phy = of_phy_find_device(phy_np);
> 	int ret;
> 
> 	if (!phy)
> 		return NULL;
> 
> 	phy->dev_flags = flags;
> 
> -----	[set flag in phydrv->mdiodrv.flags]
> 
> 	ret = phy_connect_direct(dev, phy, hndlr, iface);
> 
> 	/* refcount is held by phy_connect_direct() on success */
> 	put_device(&phy->mdio.dev);
> 
> 	return ret ? NULL : phy;
> }
> -----
> 		if (!phy) {
> 			dev_err(priv->dev, "phy \"%pOF\" not found on slave %d\n",
> 				slave->data->phy_node,
> 				slave->slave_num);
> 			return;
> 		}
> 	} else {
> 		phy = phy_connect(priv->ndev, slave->data->phy_id,
> 				 &cpsw_adjust_link, slave->data->phy_if);
> ----- replace ^^^^ with below
> {
> 	struct phy_device *phydev;
> 	struct device *d;
> 	int rc;
> 
> 	/* Search the list of PHY devices on the mdio bus for the
> 	 * PHY with the requested name
> 	 */
> 	d = bus_find_device_by_name(&mdio_bus_type, NULL, bus_id);
> 	if (!d) {
> 		pr_err("PHY %s not found\n", bus_id);
> 		return ERR_PTR(-ENODEV);
> 	}
> 	phydev = to_phy_device(d);
> 
> -----	[set flag in phydrv->mdiodrv.flags]
> 
> 	rc = phy_connect_direct(dev, phydev, handler, interface);
> 	put_device(d);
> 	if (rc)
> 		return ERR_PTR(rc);
> 
> 	return phydev;
> }
> -----
> 		if (IS_ERR(phy)) {
> 			dev_err(priv->dev,
> 				"phy \"%s\" not found on slave %d, err %ld\n",
> 				slave->data->phy_id, slave->slave_num,
> 				PTR_ERR(phy));
> 			return;
> 		}
> 	}
> }
> 
> and all above just to set a flag which will be used by just one driver as of now.
> 
> Hm. Is this some sort of punishment ;) Sry. I'll probably will take a pause.

I agree, let's not have you run in circles. Let's just use your
patches as they are, since they fix the problem and are not intrusive in
any way.
Andrew Lunn March 16, 2018, 9:14 p.m. UTC | #9
> I agree, let's not have you run into circles, let's just use your
> patches as they are since they fix the problem and are not intrusive in
> any way.

Agreed, this is too complex, for little gain.

	Andrew
Grygorii Strashko March 16, 2018, 10:08 p.m. UTC | #10
On 03/16/2018 04:14 PM, Andrew Lunn wrote:
>> I agree, let's not have you run into circles, let's just use your
>> patches as they are since they fix the problem and are not intrusive in
>> any way.
> 
> Agreed, this is too complex, for little gain.
> 

Thanks. v2 posted.